FIELD OF THE INVENTION

The present invention relates to the field of adaptive systems, and more particularly to
systems and methods which are adaptive to a human user input and/or a data environment, as
well as applications for such systems and methods. Embodiments of the invention involve, for
example, consumer electronics, personal computers, control systems, and professional
assistance systems.

BACKGROUND OF THE INVENTION 

The prior art is rich in various systems and methods for data analysis, as well as various
systems and methods relating to useful endeavors. In general, most existing systems and
methods provide concrete functions, which have a defined response to a defined stimulus. Such
systems, while embodying the "wisdom" of the designer, have a particular shortcoming in that
their capabilities are static.

Intelligent or learning systems are also known. These systems are limited by the
particular paradigm employed, and rarely are the learning algorithms general. In fact, while the
generic theory of systems which learn is well known, the application of such systems to
particular problems requires both a detailed description of the problem and knowledge of
the input and output spaces. Even once these factors are known, a substantial tuning effort may
be necessary to enable acceptable operation.

The present invention therefore builds upon the prior art, which defines various problems
to be addressed, intelligent systems and methods, tuning paradigms and user interfaces.
As set forth below, and in the attached appendix of references (including abstracts),
incorporated herein by reference, a significant number of references detail fundamental
technologies which may be improved according to the present invention, or incorporated together
to form a part of the present invention. To the same extent, these technologies are disclosed and
are expressly incorporated herein by reference to avoid duplication of prior art teachings.
However, the disclosure herein is not meant to be limiting as to the knowledge of a person of 
ordinary skill in the art. Recitation hereinbelow of these teachings or reference to these teachings 
is not meant to imply that the inventors hereof were necessarily in any way involved in these 
references, nor that the particular improvements and claimed inventions recited herein were made 
or conceived after the publication of these references. Thus, prior art cited herein is intended to 
(1) disclose information related to the application published before the filing hereof; (2) define 
the problem in the art to which the present invention is directed; (3) define prior art methods of
solving various problems also addressed by the present invention; (4) define the state of the art 
with respect to methods disclosed or referenced herein; and/or (5) detail technologies used to 
implement methods or apparatus in accordance with the present invention. 

HUMAN INTERFACE 

Aspects of the present invention provide an advanced user interface. The subject of
man-machine interfaces has been studied for many years, and indeed the entire field of ergonomics
and human factors engineering revolves around optimization of human-machine interfaces.
Typically, the optimization scheme optimizes the mechanical elements of a design, or seeks to
provide a universally optimized interface. Thus, a single user interface is typically provided for a
system. In fact, some systems provide a variety of interfaces, for example, novice, intermediate
and advanced, to provide differing balances between available control and presented complexity.
Further, adaptive and/or responsive human-machine computer interfaces are now well known.
However, a typical problem presented is defining a self-consistent and useful (i.e., an
improvement over a well-designed static interface) theory for altering the interface. Moreover,
even where, in a given application, a theory exists, the theory is typically not generalizable to
other applications. Therefore, one aspect of the present invention is to provide such a theory by
which adaptive and/or responsive user interfaces may be constructed and deployed.

In a particular application, the user interface according to the present invention is applied 
to general-purpose-type computer systems, for example, personal computers. One aspect of the 
present invention thus relates to a programmable device that comprises a menu-driven interface 
in which the user enters information using a direct manipulation input device. Such a type of 
interface scheme is disclosed in Verplank, William L., "Graphics in Human-Computer 
Communication: Principles of Graphical User-Interface Design", Xerox Office Systems. See the 
references cited therein: Foley, J.D., Wallace, V.L., Chan, P., "The Human Factor of Computer 
Graphics Interaction Techniques", IEEE CG&A, Nov. 1984, pp. 13-48; Koch, H.,
"Ergonomische Betrachtung von Schreibtastaturen", Humane Production, 1, pp. 12-15 (1985); 
Norman, D.A., Fisher, D., "Why Alphabetic Keyboards Are Not Easy To Use: Keyboard Layout 
Doesn't Much Matter", Human Factors 24(5), pp. 509-519 (1982); Perspectives: High 
Technology 2, 1985; Knowlton, K., "Virtual Pushbuttons as a Means of Person-Machine 
Interaction", Proc. of Conf. Computer Graphics, Pattern Recognition and Data Structure, Beverly 
Hills, California, May 1975, pp. 350-352; "Machine Now Reads, enters Information 25 Times 
Faster Than Human Keyboard Operators", Information Display 9, p. 18 (1981); "Scanner 
Converts Materials to Electronic Files for PCs", IEEE CG&A, Dec. 1984, p. 76; "New Beetle 
Cursor Director Escapes All Surface Constraints", Information Display 10, p. 12, 1984; Lu, C.,
"Computer Pointing Devices: Living With Mice", High Technology, Jan. 1984, pp. 61-65; 
"Finger Painting", Information Display 12, p. 18, 1981; Kraiss, K.F., "Neuere Methoden der 
Interaktion an der Schnittstelle Mensch-Maschine", Z.F. Arbeitswissenschaft, 2, pp. 65-70, 1978; 
Hirzinger, G., Landzettel, K., "Sensory Feedback Structures for Robots with Supervised 
Learning", IEEE Conf. on Robotics and Automation, St. Louis, March 1985; Horgan, H.,
"Medical Electronics", IEEE Spectrum, Jan. 1984, pp. 90-93. 



Hoffberg et al. 




LIH-10 



A menu-based remote control containing a display device is disclosed in Platte, Oberjatzas,
and Voessing, "A New Intelligent Remote Control Unit for Consumer Electronic Device", IEEE 
Transactions on Consumer Electronics, Vol. CE-31, No. 1, February 1985, 59-68. 

A directional or direct manipulation-type sensor based infrared remote control is 
disclosed in Zeisel, Tomas, Tomaszewski, "An Interactive Menu-Driven Remote Control Unit 
for TV-Receivers and VC-Recorders", IEEE Transactions on Consumer Electronics, Vol. 34, No. 
3, 814-818 (1988), which relates to a control for programming with the West German Videotext 
system. This implementation differs from the Videotext programming system described in
Bensch, U., "VPV - VIDEOTEXT PROGRAMS VIDEORECORDER", IEEE Transactions on 
Consumer Electronics, Vol. 34, No. 3, 788-792 (1988), which describes the system of Video 
Program System Signal Transmitters, in which the VCR is programmed by entering a code for 
the Video Program System signal, which is emitted by television stations in West Germany. 
Each separate program has a unique identifier code, transmitted at the beginning of the program, 
so that a user need only enter the code for the program, and the VCR will monitor the channel for
the code transmission, and begin recording when the code is received, regardless of schedule 
changes. The Videotext Programs Recorder (VPV) disclosed does not intelligently interpret the
transmission; rather, the system reads the transmitted code as a literal label, without any analysis
or determination of a classification of the program type.
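
The literal label matching described above may be illustrated by a minimal sketch. The function
name `recorded_spans`, the code strings, and the rule that recording stops when a different code
is received are assumptions made for illustration only; they are not taken from the cited
references. The sketch assumes only that each program carries a unique code and that the recorder
compares each received code, character for character, with the code entered by the user.

```python
from typing import Iterable, List, Tuple


def recorded_spans(programmed_code: str,
                   broadcast: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (start, end) time-step pairs during which the recorder runs.

    `broadcast` is a hypothetical sequence of program codes received on the
    monitored channel, one entry per time step.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for t, code in enumerate(broadcast):
        matches = (code == programmed_code)  # literal comparison, no analysis
        if matches and start is None:
            start = t                        # code received: begin recording
        elif not matches and start is not None:
            spans.append((start, t))         # other code: stop (assumption)
            start = None
    if start is not None:
        spans.append((start, t + 1))         # broadcast ended while recording
    return spans


# The program with code "A1" starts one step later than scheduled; the
# recorder still captures it because it waits for the code, not the clock.
on_time = recorded_spans("A1", ["X0", "X0", "A1", "A1", "Y2"])
delayed = recorded_spans("A1", ["X0", "X0", "X0", "A1", "A1"])
```

Because the comparison is purely literal, such a recorder cannot select a program by type or
content, which is the limitation noted in the preceding paragraph.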

Known manual input devices include the trackball, mouse, and joystick. In addition,
other devices are known, including the so-called "J-cursor" or "mousekey" which embeds a two
(x, y) or three (x, y, p) axis pressure sensor in a button conformed to a finger, present in a general
purpose keyboard; a keyboard joystick of the type described in Electronic Engineering Times,
October 28, 1991, p. 62, "IBM Points a New Way"; a so-called "isobar" which provides a two
axis input by optical sensors (θ, x); a two and one half axis (x, y, digital input) input device, such
as a mouse or a "felix" device, infrared, acoustic, etc.; position sensors for determining the
position of a finger or pointer on a display screen (touch-screen input) or on a touch surface, e.g.,
"GlidePoint" (ALPS/Cirque); goniometer input (angle position, such as human joint position
detector), etc. Many of such suitable devices are summarized in Kraiss, K. F., "Alternative Input
Devices For Human Computer Interaction", Forschungsinstitut für Anthropotechnik,
Werthhoven, F.R. Germany. Another device which may also be suitable is the GyroPoint,
available from Gyration Inc., which provides 2-D or 3-D input information in up to six axes of
motion: height, length, depth, roll, pitch and yaw. Such a device may be useful to assist a user in
inputting a complex description of an object, by providing substantially more degrees of freedom
sensing than minimally required by a standard graphic user interface. The many degrees of
freedom available thus provide suitable input for various types of systems, such as "Virtual
Reality" systems or systems which track a moving object, where many degrees of freedom and a
high degree of input accuracy are required. The Hallpot, a device which pivots a magnet about a
Hall effect sensor to produce angular orientation information, a pair of which may be used to
provide information about two axes of displacement, available from Elweco, Inc., Willoughby, OH,
may also be employed as an input device.

User input devices may be broken down into a number of categories: direct inputs, i.e.,
touch-screen and light pen; indirect inputs, i.e., trackball, joystick, mouse, touch-tablet, bar code
scanner (see, e.g., Atkinson, Terry, "VCR Programming: Making Life Easier Using Bar Codes"),
keyboard, and multi-function keys; and interactive inputs, i.e., voice activation/instructions (see,
e.g., Rosch, Winn L., "Voice Recognition: Understanding the Master's Voice", PC Magazine,
October 27, 1987, 261-308), eye tracker and data suit/data glove (see, e.g., Tello, Ernest R.,
"Between Man And Machine", Byte, September 1988, 288-293; products of EXOS, Inc.; Data
Glove). Each of the aforementioned input devices has advantages and disadvantages, which are
known in the art.

Studies suggest that a "direct manipulation" style of interface has advantages for menu 
selection tasks. This type of interface provides visual objects on a display screen, which can be 
manipulated by "pointing" and "clicking" on them. For example, the popular Graphical User
Interfaces ("GUIs"), such as Macintosh and Microsoft Windows, and others known in the art, use 
a direct manipulation style interface. A device such as a touch-screen, with a more natural 
selection technique, is technically preferable to the direct manipulation method. However, the 
accuracy limitations and relatively high cost make other inputs more commercially practical. 
Further, for extended interactive use, touchscreens are not a panacea for office productivity 
applications. In addition, the user must be within arm's length of the touch-screen display. In a
cursor positioning task, Albert (1982) found the trackball to be the most accurate pointing device 
and the touch-screen to be the least accurate when compared with other input devices such as the
light pen, joystick, data tablet, trackball, and keyboard. Epps (1986) found both the mouse and
trackball to be somewhat faster than both the touch-pad and joystick, but he concluded that there 
were no significant performance differences between the mouse and trackball as compared with 
the touch-pad and joystick. 

It is noted that in text-based applications, an input device that is accessible, without the
necessity of moving the user's hands from the keyboard, may be preferred. Thus, for example,
Electronic Engineering Times (EET), October 28, 1991, p. 62, discloses a miniature joystick
incorporated into the functional area of the keyboard. This miniature joystick has been
successfully incorporated into a number of laptop computers.

The following references are also relevant to the interface aspects of the present
invention:

Hoffberg, Linda I., "AN IMPROVED HUMAN FACTORED INTERFACE FOR
PROGRAMMABLE DEVICES: A CASE STUDY OF THE VCR", Master's Thesis, Tufts
University (Master of Sciences in Engineering Design, November, 1990).

"Bar Code Programs VCR", Design News, February 1, 1988, 26.

"How to find the best value in VCRs", Consumer Reports, March 1988, 135-141.

"Low-Cost VCRs: More For Less", Consumer Reports, March 1990, 168-172.

"Nielsen Views VCRs", Television Digest, June 23, 1988, 15.

"The Highs and Lows of Nielsen Homevideo Index", Marketing & Media Decisions,
November 1985, 84-86+.

"The Quest for 'User Friendly'", U.S. News & World Report, June 13, 1988, 54-56.

"The Smart House: Human Factors in Home Automation", Human Factors in Practice,
Dec. 1990, 1-36.

"VCR, Camcorder Trends", Television Digest, Vol. 29:16 (March 20, 1989).

"VCR's: A Look At The Top Of The Line", Consumer Reports, March 1989, 167-170.

"VHS Videocassette Recorders", Consumer Guide, 1990, 17-20.

Abedini, Kamran, "An Ergonomically-improved Remote Control Unit Design", Interface
'87 Proceedings, 375-380.

Abedini, Kamran, and Hadad, George, "Guidelines For Designing Better VCRs", Report
No. IME 462, February 4, 1987.

Bensch, U., "VPV - VIDEOTEXT PROGRAMS VIDEORECORDER", IEEE
Transactions on Consumer Electronics, 34(3):788-792.

Berger, Ivan, "Secrets of the Universals", Video, February 1989, 45-47+.

Beringer, D.B., "A Comparative Evaluation of Calculator Watch Data Entry
Technologies: Keyboards to Chalkboards", Applied Ergonomics, December 1985, 275-278.

Bier, E. A. et al., "MMM: A User Interface Architecture for Shared Editors on a Single
Screen," Proceedings of the ACM Symposium on User Interface Software and Technology, Nov.
11-13, 1991, p. 79.

Bishop, Edward W., and Guinness, G. Victor Jr., "Human Factors Interaction with
Industrial Design", Human Factors, 8(4):279-289 (August 1966).

Brown, Edward, "Human Factors Concepts For Management", Proceedings of the Human
Factors Society, 1973, 372-375.

Bulkeley, Debra, "The Smartest House in America", Design News, October 19, 1987,
56-61.

Card, Stuart K., "A Method for Calculating Performance times for Users of Interactive
Computing Systems", IEEE, 1979, 653-658.

Carlson, Mark A., "Design Goals for an Effective User Interface", Electro/82
Proceedings, 3/1/1-3/1/4.

Carlson, Mark A., "Design Goals for an Effective User Interface", Human Interfacing
with Instruments, Session 3.

Carroll, Paul B., "High Tech Gear Draws Cries of 'Uncle'", Wall Street Journal, April 27,
1988, 29.

Cobb, Nathan, "I don't get it", Boston Sunday Globe Magazine, March 25, 1990, 23-29.

Davis, Fred, "The Great Look-and-Feel Debate", A+, 5:9-11 (July 1987).

Dehning, Waltraud, Essig, Heidrun, and Maass, Susanne, The Adaptation of Virtual
Man-Computer Interfaces to User Requirements in Dialogs, Germany: Springer-Verlag, 1981.

Ehrenreich, S.L., "Computer Abbreviations - Evidence and Synthesis", Human Factors,
27(2):143-155 (April 1985).

Friedman, M.B., "An Eye Gaze Controlled Keyboard", Proceedings of the 2nd
International Conference on Rehabilitation Engineering, 1984, 446-447.

Gilfoil, D., and Mauro, C.L., "Integrating Human Factors and Design: Matching Human
Factors Methods up to Product Development", C.L. Mauro Assoc., Inc., 1-7.

Gould, John D., Boies, Stephen J., Meluson, Antonia, Rasammy, Marwan, and Vosburgh,
Ann Marie, "Entry and Selection Methods For Specifying Dates", Human Factors, 32(2):199-214
(April 1989).

Green, Lee, "Thermo Tech: Here's a common sense guide to the new thinking
thermostats", Popular Mechanics, October 1985, 155-159.

Grudin, Jonathan, "The Case Against User Interface Consistency", MCC Technical
Report Number ACA-HI-002-89, January 1989.

Harvey, Michael G., and Rothe, James T., "VideoCassette Recorders: Their Impact on
Viewers and Advertisers", Journal of Advertising, 25:19-29 (December/January 1985).

Hawkins, William J., "Super Remotes", Popular Science, February 1989, 76-77.

Henke, Lucy L., and Donohue, Thomas R., "Functional Displacement of Traditional TV
Viewing by VCR Owners", Journal of Advertising Research, 29:18-24 (April-May 1989).

Hoban, Phoebe, "Stacking the Decks", New York, February 16, 1987, 20:14.

Howard, Bill, "Point and Shoot Devices", PC Magazine, 6:95-97 (August 1987).

Jane Pauley Special, NBC TV News Transcript, July 17, 1990, 10:00 PM.

Kolson, Ann, "Computer wimps drown in a raging sea of technology", The Hartford
Courant, May 24, 1989, B1.

Kreifeldt, J.G., "A Methodology For Consumer Product Safety Analysis", The 3rd
National Symposium on Human Factors in Industrial Design in Consumer Products, August
1982, 175-184.

Kreifeldt, John, "Human Factors Approach to Medical Instrument Design", Electro/82
Proceedings, 3/3/1-3/3/6.

Kuocheng, Andy Poing, and Ellingstad, Vernon S., "Touch Tablet and Touch Input",
Interface '87, 327.

Ledgard, Henry, Singer, Andrew, and Whiteside, John, Directions in Human Factors for
Interactive Systems, New York, Springer-Verlag, 1981.

Lee, Eric, and MacGregor, James, "Minimizing User Search Time Menu Retrieval
Systems", Human Factors, 27(2):157-162 (April 1986).

Leon, Carol Boyd, "Selling Through the VCR", American Demographics, December
1987, 40-43.

Long, John, "The Effect of Display Format on the Direct Entry of Numerical Information
by Pointing", Human Factors, 26(1):3-17 (February 1984).

Mantei, Marilyn M., and Teorey, Toby J., "Cost/Benefit Analysis for Incorporating
Human Factors in the Software Lifecycle", Association for Computing Machinery, 1988.

Meads, Jon A., "Friendly or Frivolous", Datamation, April 1, 1988, 98-100.

Moore, T.G. and Dartnall, "Human Factors of a Microelectronic Product: The Central
Heating Timer/Programmer", Applied Ergonomics, 1983, 13(1):15-23.

Norman, Donald A., "Infuriating By Design", Psychology Today, 22(3):52-56 (March
1988).

Norman, Donald A., The Psychology of Everyday Things, New York, Basic Books, Inc.,
1988.

Platte, Hans-Joachim, Oberjatzas, Gunter, and Voessing, Walter, "A New Intelligent
Remote Control Unit for Consumer Electronic Device", IEEE Transactions on Consumer
Electronics, Vol. CE-31(1):59-68 (February 1985).

Rogus, John G. and Armstrong, Richard, "Use of Human Engineering Standards in
Design", Human Factors, 19(1):15-23 (February 1977).

Rosch, Winn L., "Voice Recognition: Understanding the Master's Voice", PC Magazine,
October 27, 1987, 261-308.

Sarver, Carleton, "A Perfect Friendship", High Fidelity, 39:42-49 (May 1989).

Schmitt, Lee, "Let's Discuss Programmable Controllers", Modern Machine Shop, May
1987, 90-99.

Shneiderman, Ben, Designing the User Interface: Strategies for Effective
Human-Computer Interaction, Reading, MA, Addison-Wesley, 1987.

Smith, Sidney J., and Mosier, Jane N., Guidelines for Designing User Interface Software,
Bedford, MA, MITRE, 1986.

Sperling, Barbara Bied, Tullis, Thomas S., "Are You a Better 'Mouser' or 'Trackballer'? A
Comparison of Cursor-Positioning Performance", An Interactive/Poster Session at the
CHI+GI'87 Graphics Interface and Human Factors in Computing Systems Conference.

Streeter, L.A., Ackroff, J.M., and Taylor, G.A., "On Abbreviating Command Names", The
Bell System Technical Journal, 62(6):1807-1826 (July/August 1983).

Swanson, David, and Klopfenstein, Bruce, "How to Forecast VCR Penetration",
American Demographics, December 1987, 44-45.

Tello, Ernest R., "Between Man And Machine", Byte, September 1988, 288-293.

Thomas, John C., and Schneider, Michael L., Human Factors in Computer Systems, New
Jersey, Ablex Publ. Co., 1984.

Trachtenberg, Jeffrey A., "How do we confuse thee? Let us count the ways", Forbes,
March 21, 1988, 159-160.

Tyldesley, D.A., "Employing Usability Engineering in the Development of Office
Products", The Computer Journal, 31(5):431-436 (1988).

Verplank, William L., "Graphics in Human-Computer Communication: Principles of
Graphical User-Interface Design", Xerox Office Systems.

Voyt, Carlton F., "PLC's Learn New Languages", Design News, January 2, 1989, 78.

Whitefield, A., "Human Factors Aspects of Pointing as an Input Technique in Interactive
Computer Systems", Applied Ergonomics, June 1986, 97-104.

Wiedenbeck, Susan, Lambert, Robin, and Scholtz, Jean, "Using Protocol Analysis to
Study the User Interface", Bulletin of the American Society for Information Science, June/July
1989, 25-26.

Wilke, William, "Easy Operation of Instruments by Both Man and Machine", Electro/82
Proceedings, 3/2/1-3/2/4.

Yoder, Stephen Kreider, "U.S. Inventors Thrive at Electronics Show", The Wall Street
Journal, January 10, 1990, B1.

Zeisel, Gunter, Tomas, Philippe, Tomaszewski, Peter, "An Interactive Menu-Driven
Remote Control Unit for TV-Receivers and VC-Recorders", IEEE Transactions on Consumer
Electronics, 34(3):814-818.

AGENT TECHNOLOGIES 

Presently well known human computer interfaces include so-called agent technology, in
which the computer interface learns a task defined (inherently or explicitly) by the user and
subsequently executes the task. Such systems are available from Firefly (www.firefly.com), and
are commercially present in some on-line commerce systems, such as Amazon.com
(www.amazon.com). See:

"ABI WHAP, Web Hypertext Applications Processor,"
http://alphabase.com/abi3/whapinfo.html#profiling, (1996, Jul. 11).

"AdForce Feature Set", http://www.imgis.com/index.html/core/p2-2html (1997, Apr. 11).

"EPRO," http://www.ipro.com/, Internet profiles Corporation Home and other Web Pages
(1996, Jul. 11).

"Media Planning is Redefined in a New Era of Online Advertising," PR Newswire,
(1996, Feb. 5).

"My Yahoo! news summary for My Yahoo! Quotes", http://my.yahoo.com, (1997, Jan. 27).

"NetGravity Announces Adserver 2.1",
http://www.netgravity.com/news/pressrel/launch21.html (1997, Apr. 11).

"Netscape & NetGravity: Any Questions?", http://www.netgravity.com/, (1996, Jul. 11).

"Network Site Main", http://www.doubleclick.net/frames/general/nets2set.htm (1997,
Apr. 11).

"Real Media," http://www.realmedia.com/index.html, (1996, Jul. 11).

"The Front Page", http://live.excite.com/7aBb (1997, Jan. 27) and (1997, Apr. 11).

"The PointCast Network," http://www.pointcast.com/, (1996, Spring).

"The Power of PenPoint", Carr et al., 1991, p. 39, Chapter 13, pp. 258-260.

"Welcome to Lycos," http://www.lycos.com, (1997, Jan. 27).

Abatemarco, Fred, "From the Editor", Popular Science, Sep. 1992, p. 4.

Berniker, M., "Nielsen plans Internet Service," Broadcasting & Cable, 125(30):34 (1995,
Jul. 24).

Berry, Deanne, et al. In an Apr. 10, 1990 news release, Symantec announced a new
version of MORE (TM).

Betts, M., "Sentry cuts access to naughty bits," Computers and Security, vol. 14, No. 7, p.
615 (1995).

Boy, Guy A., Intelligent Assistant Systems, Harcourt Brace Jovanovich, 1991, uses the
term "Intelligent Assistant Systems".

Bussey, H.E., et al., "Service Architecture, Prototype Description, and Network
Implications of a Personalized Information Grazing Service," IEEE Multiple Facets of
Integration Conference Proceedings, vol. 3, No. Conf. 9, Jun. 3, 1990, pp. 1046-1053.

Donnelley, J.E., "WWW media distribution via Hopwise Reliable Multicast," Computer
Networks and ISDN Systems, vol. 27, No. 6, pp. 781-788 (Apr., 1995).

Edwards, John R., "Q&A: Integrated Software with Macros and an Intelligent Assistant",
Byte Magazine, Jan. 1986, vol. 11, Issue 1, pp. 120-122, critiques the Intelligent Assistant by
Symantec Corporation.

Elofson, G. and Konsynski, B., "Delegation Technologies: Environmental Scanning with
Intelligent Agents", Journal of Management Information Systems, Summer 1991, vol. 8, Issue 1,
pp. 37-62.

Garretson, R., "IBM Adds 'Drawing Assistant' Design Tool to Graphics Series", PC
Week, Aug. 13, 1985, vol. 2, Issue 32, p. 8.

Gessler, S. and Kotulla, A., "PDAs as mobile WWW browsers," Computer Networks and
ISDN Systems, vol. 28, No. 1-2, pp. 53-59 (Dec. 1995).

Glinert-Stevens, Susan, "Microsoft Publisher: Desktop Wizardry", PC Sources, Feb.
1992, vol. 3, Issue 2, p. 357.

Goldberg, Cheryl, "IBM Drawing Assistant: Graphics for the EGA", PC Magazine, Dec.
24, 1985, vol. 4, Issue 26, p. 255.

Hendrix, Gary G. and Walter, Brett A., "The Intelligent Assistant: Technical
Considerations Involved in Designing Q&A's Natural-language Interface", Byte Magazine, Dec.
1987, vol. 12, Issue 14, p. 251.

Hoffman, D.L. et al., "A New Marketing Paradigm for Electronic Commerce," (1996,
Feb. 19), http://www2000.ogsm.vanderbilt.edu/novak/new.marketing.paradigm.html.

Information describing BroadVision One-to-One Application System: "Overview," p. 1;
Further Resources on One-To-One Marketing, p. 1; BroadVision Unleashes the Power of the
Internet with Personalized Marketing and Selling, pp. 1-3; Frequently Asked Questions, pp. 1-3;
Products, p. 1; BroadVision One-To-One (TM), pp. 1-2; Dynamic Command Center, p. 1;
Architecture that Scales, pp. 1-2; Technology, p. 1; Creating a New Medium for Marketing and
Selling BroadVision One-To-One and the World Wide Web a White Paper, pp. 1-15;
http://www.broadvision.com (1996, Jan.-Mar.).

Jones, R., "Digital's World-Wide Web server: A case study," Computer Networks and
ISDN Systems, vol. 27, No. 2, pp. 297-306 (Nov. 1994).

McFadden, M., "The Web and the Cookie Monster," Digital Age, (1996, Aug.).

Nadoli, Gajanana and Biegel, John, "Intelligent Agents in the Simulation of
Manufacturing Systems", Proceedings of the SCS Multiconference on AI and Simulation, 1989.

Nilsson, B.A., "Microsoft Publisher is an Honorable Start for DTP Beginners", Computer
Shopper, Feb. 1992, vol. 12, Issue 2, p. 426, evaluates Microsoft Publisher and Page Wizard.

O'Connor, Rory J., "Apple Banking on Newton's Brain", San Jose Mercury News,
Wednesday, Apr. 22, 1992.

Ohsawa, I. and Yonezawa, A., "A Computational Model of an Intelligent Agent Who
Talks with a Person", Research Reports on Information Sciences, Series C, Apr. 1989, No. 92,
pp. 1-18.

Pazzani, M. et al., "Learning from hotlists and coldlists: Towards a WWW Information
Filtering and Seeking Agent," Proceedings International Conference on Tools with Artificial
Intelligence, Jan. 1995, pp. 492-495.

Poor, Alfred, "Microsoft Publisher", PC Magazine, Nov. 26, 1991, vol. 10, Issue 20, p.
40, evaluates Microsoft Publisher.

PRNewswire, information concerning the PointCast Network (PCN) (1996, Feb. 13) p.

Raggett, D., "A review of the HTML + document format," Computer Networks and
ISDN Systems, vol. 27, No. 2, pp. 135-145 (Nov. 1994).

Rampe, Dan, et al. In a Jan. 9, 1989 news release, Claris Corporation announced two
products, SmartForm Designer and SmartForm Assistant, which provide "Intelligent Assistance",
such as custom help messages, choice lists, and data-entry validation and formatting.

Ratcliffe, Mitch and Gore, Andrew, "Intelligent Agents take U.S. Bows.", MacWeek,
Mar. 2, 1992, vol. 6, No. 9, p. 1.

Sharif Heger, A. and Koen, B. V., "KNOWBOT: an Adaptive Data Base Interface",
Nuclear Science and Engineering, Feb. 1991, vol. 107, No. 2, pp. 142-157.

Soviero, Marcelle M., "Your World According to Newton", Popular Science, Sep. 1992,
pp. 45-49.

Upendra Shardanand, "Social Information Filtering for Music Recommendation", Sep.
1994, pp. 1-93, Massachusetts Institute of Technology, Thesis.

Weber, Thomas E., "Software Lets Marketers Target Web Ads," The Wall Street Journal,
Apr. 21, 1997.

Weiman, Liza and Moran, Tom, "A Step toward the Future", Macworld, Aug. 1992, pp.
129-131.

Yan, T.W. and Garcia-Molina, H., "SIFT - A Tool for Wide-Area Information
Dissemination," Paper presented at the USENIX Technical Conference, New Orleans, LA (1995,
Jan.), pp. 177-186.

INDUSTRIAL CONTROLS 

Industrial control systems are well known. Typically, a dedicated reliable hardware
module controls a task using a conventional algorithm, with a low level user interface. These
devices are programmable, and therefore a high level software program may be provided to
translate user instructions into the low level commands, and to analyze any return data. See, U.S.
Patent No. 5,506,768, expressly incorporated herein by reference. See, also:

A. B. Corripio, "Tuning of Industrial Control Systems", Instrument Society of America,
Research Triangle Park, NC (1990) pp. 65-81.

C. J. Harris & S. A. Billings, "Self-Tuning and Adaptive Control: Theory and
Applications", Peter Peregrinus LTD (1981) pp. 20-33.

C. Rohrer & Clay Nesler, "Self-Tuning Using a Pattern Recognition Approach", Johnson
Controls, Inc., Research Brief 228 (Jun. 13, 1986).

D. E. Seborg, T. F. Edgar, & D. A. Mellichamp, "Process Dynamics and Control", John
Wiley & Sons, NY (1989) pp. 294-307, 538-541.

E. H. Bristol & T. W. Kraus, "Life with Pattern Adaptation", Proceedings 1984 American
Control Conference, pp. 888-892, San Diego, CA (1984).

Francis Scheid, "Schaum's Outline Series-Theory & Problems of Numerical Analysis",
McGraw-Hill Book Co., NY (1968) pp. 236, 237, 243, 244, 261.

K. J. Astrom and B. Wittenmark, "Adaptive Control", Addison-Wesley Publishing
Company (1989) pp. 105-215.

K. J. Astrom, T. Hagglund, "Automatic Tuning of PID Controllers", Instrument Society
of America, Research Triangle Park, NC (1988) pp. 105-132.

R. W. Haines, "HVAC Systems Design Handbook", TAB Professional and Reference
Books, Blue Ridge Summit, PA (1988) pp. 170-177.

S. M. Pandit & S. M. Wu, "Time Series & System Analysis with Applications", John
Wiley & Sons, Inc., NY (1983) pp. 200-205.

T. W. Kraus & T. J. Myron, "Self-Tuning PID Controller Uses Pattern Recognition
Approach", Control Engineering, pp. 106-111, Jun. 1984.


PATTERN RECOGNITION 

Another aspect of some embodiments of the invention relates to signal analysis and
complex pattern recognition. This aspect encompasses analysis of any data set presented to the
system: internal, user interface, or the environment in which it operates. While semantic, optical
and audio analysis systems are known, the invention is by no means limited to these types of
data.

Pattern recognition involves examining a complex data set to determine similarities (in its
broadest context) with other data sets, typically data sets which have been previously
characterized. These data sets may comprise multivariate inputs, sequences in time or other
dimension, or a combination of multivariate inputs and sequences in multiple dimensions.
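
One simple way to determine similarity with previously characterized data sets is a
nearest-neighbor comparison, sketched below. This is only an illustration under stated
assumptions and is not the method of the invention or of any cited reference: the names
`euclidean`, `classify` and `references`, the sample values, and the choice of Euclidean distance
as the similarity measure are all hypothetical.

```python
import math
from typing import Dict, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two multivariate samples of equal length."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample: Sequence[float],
             characterized: Dict[str, Sequence[float]]) -> str:
    """Return the label of the characterized data set closest to `sample`."""
    return min(characterized,
               key=lambda label: euclidean(sample, characterized[label]))


# Hypothetical reference patterns that have been previously characterized.
references = {
    "pattern_a": [0.0, 0.0, 1.0],
    "pattern_b": [5.0, 5.0, 0.0],
}

# An unknown multivariate input is assigned the label of its nearest reference.
result = classify([4.5, 5.2, 0.1], references)
```

A sequence in time could be handled in the same way by treating successive samples as
additional variables, although practical systems generally use measures better suited to
sequences.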

The following cited patents and publications are relevant to pattern recognition and
control aspects of the present invention, and are herein expressly incorporated by reference:

U.S. Patent 5,067,163, incorporated herein by reference, discloses a method for 
determining a desired image signal range from an image having a single background, in 
particular a radiation image such as a medical X-ray. This reference teaches basic image 
enhancement techniques. 

U.S. Patent 5,068,664, incorporated herein by reference, discloses a method and device 
for recognizing a target among a plurality of known targets, by using a probability based 
recognition system. This patent document cites a number of other references, which are relevant 
to the problem of image recognition: 

Appriou, A., "Interet des theories de l'incertain en fusion de donnees", Colloque 
International sur le Radar Paris, 24-28 avril 1989. 

Appriou, A., "Procedure d'aide a la decision multi-informateurs. Applications a la 
classification multi-capteurs de cibles", Symposium de l'Avionics Panel (AGARD) Turquie, 
25-29 avril 1988. 

Arrow, K. J., "Social choice and individual values", John Wiley and Sons Inc. (1963). 

Bellman, R. E., L. A. Zadeh, "Decision making in a fuzzy environment", Management 
Science, 17(4) (December 1970). 

Bhatnagar, R. K., L. N. Kanal, "Handling uncertain information: a review of numeric 
and non-numeric methods", Uncertainty in Artificial Intelligence, L. N. Kanal and J. F. Lemmer, 
Eds. (1986). 

Blair, D., R. Pollack, "La logique du choix collectif", Pour la Science (1983). 

Chao, J. J., E. Drakopoulos, C. C. Lee, "An evidential reasoning approach to distributed 
multiple hypothesis detection", Proceedings of the 20th Conference on decision and control, Los 
Angeles, Calif., December 1987. 

Dempster, A. P., "A generalization of Bayesian inference", Journal of the Royal 
Statistical Society, Vol. 30, Series B (1968). 

Dempster, A. P., "Upper and lower probabilities induced by a multivalued mapping", 
Annals of Mathematical Statistics, no. 38 (1967). 

Dubois, D., "Modeles mathematiques de l'imprecis et de l'incertain en vue d'applications 
aux techniques d'aide a la decision", Doctoral Thesis, University of Grenoble (1983). 

Dubois, D., H. Prade, "Combination of uncertainty with belief functions: a 
reexamination", Proceedings 9th International Joint Conference on Artificial Intelligence, Los 
Angeles (1985). 

Dubois, D., H. Prade, "Fuzzy sets and systems-Theory and applications", Academic 
Press, New York (1980). 

Dubois, D., H. Prade, "Theorie des possibilites: application a la representation des 
connaissances en informatique", Masson, Paris (1985). 

Duda, R. O., P. E. Hart, N. J. Nilsson, "Subjective Bayesian methods for rule-based 
inference systems", Technical Note 124, Artificial Intelligence Center, SRI International. 

Fua, P. V., "Using probability density functions in the framework of evidential reasoning", 
Uncertainty in knowledge based systems, B. Bouchon, R. R. Yager, Eds., Springer Verlag 
(1987). 

Ishizuka, M., "Inference methods based on extended Dempster and Shafer's theory for 
problems with uncertainty/fuzziness", New Generation Computing, 1:159-168 (1983), Ohmsha, 
Ltd., and Springer Verlag. 

Jeffrey, R. J., "The logic of decision", The University of Chicago Press, Ltd., London 
(1983) (2nd Ed.). 

Kaufmann, A., "Introduction a la theorie des sous-ensembles flous", Vol. 1, 2 et 
3, Masson, Paris (1975). 

Keeney, R. L., H. Raiffa, "Decisions with multiple objectives: Preferences and value 
tradeoffs", John Wiley and Sons, New York (1976). 

Ksienski et al., "Low Frequency Approach to Target Identification", Proc. of the IEEE 
63(12):1651-1660 (Dec. 1975). 

Kyburg, H. E., "Bayesian and non Bayesian evidential updating", Artificial Intelligence 
31:271-293 (1987). 

Roy, B., "Classements et choix en presence de points de vue multiples", R.I.R.O.-2eme 
annee-no. 8, pp. 57-75 (1968). 

Roy, B., "Electre III: un algorithme de classements fonde sur une representation floue des 
preferences en presence de criteres multiples", Cahiers du CERO, 20(1):3-24 (1978). 

Scharlic, A., "Decider sur plusieurs criteres. Panorama de l'aide a la decision multicritere", 
Presses Polytechniques Romandes (1985). 

Shafer, G., "A mathematical theory of evidence", Princeton University Press, Princeton, 
New Jersey (1976). 

Sugeno, M., "Theory of fuzzy integrals and its applications", Tokyo Institute of 
Technology (1974). 

Vannicola et al., "Applications of Knowledge based Systems to Surveillance", 
Proceedings of the 1988 IEEE National Radar Conference, 20-21 Apr. 1988, pp. 157-164. 

Yager, R. R., "Entropy and specificity in a mathematical theory of Evidence", Int. J. 
General Systems, 9:249-260 (1983). 

Zadeh, L. A., "Fuzzy sets as a basis for a theory of possibility", Fuzzy Sets and Systems 
1:3-28 (1978). 

Zadeh, L. A., "Fuzzy sets", Information and Control, 8:338-353 (1965). 

Zadeh, L. A., "Probability measures of fuzzy events", Journal of Mathematical Analysis 
and Applications, 23:421-427 (1968). 

U.S. Patent No. 5,067,161, incorporated herein by reference, relates to a video image 
pattern recognition system, which recognizes objects in near real time. 

U.S. Patent Nos. 4,817,176 and 4,802,230, both incorporated herein by reference, relate 
to harmonic transform methods of pattern matching of an undetermined pattern to known 
patterns, and are useful in the pattern recognition method of the present invention. U.S. Patent 
4,998,286, incorporated herein by reference, relates to a harmonic transform method for 
comparing multidimensional images, such as color images, and is useful in the present pattern 
recognition methods. 

U.S. Patent 5,067,166, incorporated herein by reference, relates to a pattern recognition 
system, in which a local optimum match between subsets of candidate reference label sequences 
and candidate templates is determined. It is clear that this method is useful in the pattern 
recognition aspects of the present invention. It is also clear that the interface and control system 
of the present invention are useful adjuncts to the method disclosed in U.S. Patent 5,067,166. 

U.S. Patent 5,048,095, incorporated herein by reference, relates to the use of a genetic 
learning algorithm to adaptively segment images, which is an initial stage in image recognition. 
This patent has a software listing for this method. It is clear that this method is useful in the 
pattern recognition aspects of the present invention. It is also clear that the interface and control 
system of the present invention are useful adjuncts to the method disclosed in U.S. Patent 
5,048,095. 



FRACTAL-BASED IMAGE PROCESSING 

Fractals are a relatively new field of science and technology that relates to the study of 
order and chaos. While the field of fractals is now very dense, a number of relevant principles 
are applicable. First, when the coordinate axes of a space are not independent, and are related by 
a recursive algorithm, then the space is considered to have a fractional dimensionality. One 
characteristic of such systems is that a mapping of such spaces tends to have self-similarity on a 
number of scales. Interestingly, natural systems have also been observed to have self-similarity 
over several orders of magnitude, although as presently believed, not over an unlimited range of 
scales. Therefore, one theory holds that images of natural objects may be efficiently described 
by iterated function systems (IFS), which provide a series of parameters for a generic formula or 
algorithm, which, when the process is reversed, yields an image visually similar to the starting 
image. Since the "noise" of the expanded data is masked by the "natural" appearance of the 
result, visually acceptable image compression may be provided at relatively high compression 
ratios. This theory remains the subject of significant debate, and, for example, wavelet algorithm 
advocates claim superior results for a more general set of starting images. It is noted that, on a 
mathematical level, wavelet and fractal theories have some common threads. 

According to a particular embodiment of the invention, the expression of an image as an 
ordered set of coefficients of an algorithm, wherein the coefficients relate to elements of defined 
variation in scale, and the resulting set of coefficients is related to the underlying image 
morphology, is exploited in order to provide a means for pattern analysis and recognition without 
requiring decompression to an orthogonal coordinate space. 
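
As an illustrative sketch of how an ordered set of coefficients can stand in for an image, the following Python fragment expands a small iterated function system into a raster using the random iteration ("chaos game") algorithm. The three affine maps (which happen to generate a Sierpinski triangle), the function name render_ifs, the grid size and the point count are all assumptions chosen for illustration; none of them is taken from the cited references.

```python
import random

# Each map is w(x, y) = (a*x + b*y + e, c*x + d*y + f), stored as (a, b, c, d, e, f).
# These 18 numbers are a hypothetical example code, not data from any reference.
SIERPINSKI = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]


def render_ifs(maps, n_points=20000, size=32, seed=0):
    """Rasterize the attractor of an IFS whose attractor lies in the unit square."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    x, y = 0.0, 0.0
    for i in range(n_points):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= 20:  # skip the first few points while the orbit settles
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            grid[row][col] = 1
    return grid


if __name__ == "__main__":
    # Print the raster with the largest y at the top.
    for row in reversed(render_ifs(SIERPINSKI)):
        print("".join("#" if v else "." for v in row))
```

In this toy case eighteen coefficients expand into a 32 by 32 raster. A practical coder would have to solve the much harder inverse problem of finding coefficients for a given image.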

U.S. Patent Nos. 5,065,447, and 4,941,193, both incorporated herein by reference, relate 
to the compression of image data by using fractal transforms. These are discussed in detail 
below. U.S. Patent 5,065,447 cites a number of references, relevant to the use of fractals in 
image processing: 

U.S. Patent No. 4,831,659. 

"A New Class of Markov Processes for Image Encoding", School of Mathematics, 
Georgia Inst. of Technology (1988), pp. 14-32. 

"Construction of Fractal Objects with Iterated Function Systems", Siggraph '85 
Proceedings, 19(3):271-278 (1985). 

"Data Compression: Pntng by Numbrs", The Economist, May 21, 1988. 

"Fractal Geometry-Understanding Chaos", Georgia Tech Alumni Magazine, p. 16 (Spring 
1986). 

"Fractal Modelling of Biological Structures", Perspectives in Biological Dynamics and 
Theoretical Medicine, Koslow, Mandell, Shlesinger, eds., Annals of New York Academy of 
Sciences, vol. 504, 179-194 (date unknown). 

"Fractal Modelling of Real World Images, Lecture Notes for Fractals: Introduction, 
Basics and Perspectives", Siggraph (1987). 

"Fractals-A Geometry of Nature", Georgia Institute of Technology Research Horizons, p. 
9 (Spring 1986). 

A. Jacquin, "A Fractal Theory of Iterated Markov Operators with Applications to Digital 
Image Coding", PhD Thesis, Georgia Tech, 1989. 

A. Jacquin, "Image Coding Based on a Fractal Theory of Iterated Contractive Image 
Transformations", IEEE Trans. on Image Processing, Vol. 1, Issue 1, p. 18 (January 1992). 

A. Jacquin, 'Fractal image coding based on a theory of iterated contractive image 
transformations', Proc. SPIE Visual Communications and Image Processing, 1990, pages 227-239. 

A.E. Jacquin, 'A novel fractal block-coding technique for digital images', Proc. ICASSP 
1990. 

Baldwin, William, "Just the Bare Facts, Please", Forbes Magazine, Dec. 12, 1988. 

Barnsley et al., "A Better Way to Compress Images", Byte Magazine, Jan. 1988, pp. 213-225. 

Barnsley et al., "Chaotic Compression", Computer Graphics World, Nov. 1987. 

Barnsley et al., "Harnessing Chaos For Image Synthesis", Computer Graphics, 
22(4):131-140 (August, 1988). 

Barnsley et al., "Hidden Variable Fractal Interpolation Functions", School of 
Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, Jul., 1986. 

Barnsley, M.F., "Fractals Everywhere", Academic Press, Boston, MA, 1988. 

Barnsley, M.F., and Demko, S., "Iterated Function Systems and The Global Construction 
of Fractals", Proc. R. Soc. Lond., A399:243-275 (1985). 

Barnsley, M.F., Ervin, V., Hardin, D., Lancaster, J., "Solution of an Inverse Problem for 
Fractals and Other Sets", Proc. Natl. Acad. Sci. U.S.A., 83:1975-1977 (Apr. 1986). 

Beaumont J M, "Image data compression using fractal techniques", British Telecom 
Technological Journal 9(4):93-108 (1991). 

Byte Magazine, Jan. 1988, supra, cites: 

D.S. Mazel, Fractal Modeling of Time-Series Data, PhD Thesis, Georgia Tech, 1991. 
(One dimensional, not pictures). 

Derra, Skip, "Researchers Use Fractal Geometry ...", Research and Development 
Magazine, Mar. 1988. 

Elton, J., "An Ergodic Theorem for Iterated Maps", Journal of Ergodic Theory and 
Dynamical Systems, 7 (1987). 

Fisher Y, "Fractal image compression", Siggraph 92. 

Michael F. Barnsley and Lyman P. Hurd, "Fractal Image Compression", ISBN 0-86720-457-5, 
ca. 250 pp. 

Fractal Image Compression: Theory and Application, Yuval Fisher (ed.), Springer 
Verlag, New York, 1995. ISBN number 0-387-94211-4. 

Fractal Modelling of Biological Structures, School of Mathematics, Georgia Institute of 
Technology (date unknown). 

G.E. Oien, S. Lepsoy & T.A. Ramstad, 'An inner product space approach to image coding 
by contractive transformations', Proc. ICASSP 1991, pp 2773-2776. 

Gleick, James, "Making a New Science", pp. 215, 239, date unknown. 

Graf S, "Barnsley's Scheme for the Fractal Encoding of Images", Journal Of 
Complexity, V8, 72-78 (1992). 

Jacobs, E.W., Y. Fisher and R.D. Boss, "Image Compression: A study of the Iterated 
Transform Method", Signal Processing 29, (1992) 251-263. 

M. Barnsley, L. Anson, "Graphics Compression Technology", Sun World, October 1991, 
pp. 42-52. 

M.F. Barnsley, A. Jacquin, F. Malassenet, L. Reuter & A.D. Sloan, 'Harnessing chaos for 
image synthesis', Computer Graphics, vol 22 no 4 pp 131-140, 1988. 

M.F. Barnsley, A.E. Jacquin, 'Application of recurrent iterated function systems to 
images', Visual Comm. and Image Processing, vol SPIE-1001, 1988. 

Mandelbrot, B., "The Fractal Geometry of Nature", W.H. Freeman & Co., San Francisco, 
CA, 1982, 1977. 

Monro D M and Dudbridge F, "Fractal block coding of images", Electronics Letters 
28(11):1053-1054 (1992). 

Monro D.M. & Dudbridge F. 'Fractal approximation of image blocks', Proc ICASSP 92, 
pp. III: 485-488. 

Monro D.M. 'A hybrid fractal transform', Proc ICASSP 93, pp. V: 169-72. 

Monro D.M., Wilson D., Nicholls J.A. 'High speed image coding with the Bath Fractal 
Transform', IEEE International Symposium on Multimedia Technologies Southampton, April 

Peterson, Ivars, "Packing It In-Fractals ..", Science News, 131(18):283-285 (May 2, 
1987). 

S. A. Hollatz, "Digital image compression with two-dimensional affine fractal 
interpolation functions", Department of Mathematics and Statistics, University of 
Minnesota-Duluth, Technical Report 91-2. (a nuts-and-bolts how-to-do-it paper on the technique). 

Stark, J., "Iterated function systems as neural networks", Neural Networks Vol 4 pp 
679-690, Pergamon Press, 1991. 

Vrscay, Edward R., "Iterated Function Systems: Theory, Applications, and the Inverse 
Problem", Fractal Geometry and Analysis, J. Belair and S. Dubuc (eds.), Kluwer Academic, 1991, 
pp. 405-468. 

U.S. Patent No. 5,347,600, incorporated herein by reference, relates to a method and 
apparatus for compression and decompression of digital image data, using fractal methods. 
According to this method, digital image data is automatically processed by dividing stored image 
data into domain blocks and range blocks. The range blocks are subjected to processes such as a 
shrinking process to obtain mapped range blocks. The range blocks or domain blocks may also 
be processed by processes such as affine transforms. Then, for each domain block, the mapped 
range block which is most similar to the domain block is determined, and the address of that 
range block and the processes the blocks were subjected to are combined as an identifier which is 
appended to a list of identifiers for other domain blocks. The list of identifiers for all domain 
blocks is called a fractal transform and constitutes a compressed representation of the input 
image. To decompress the fractal transform and recover the input image, an arbitrary input 
image is formed into range blocks and the range blocks processed in a manner specified by the 
identifiers to form a representation of the original input image. 
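
The encode and decode loop just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of U.S. Patent No. 5,347,600: the block naming follows the paragraph above (small domain blocks are matched against shrunken range blocks), the only "process" applied is a fixed contrast of 0.5 plus a fitted brightness offset, and the function names (shrink, encode, decode), the 2 by 2 domain block size and the iteration count are hypothetical.

```python
def shrink(img, r, c, size):
    """Shrink the size-by-size block at (r, c) by averaging 2x2 pixel groups."""
    half = size // 2
    return [[(img[r + 2 * i][c + 2 * j] + img[r + 2 * i][c + 2 * j + 1]
              + img[r + 2 * i + 1][c + 2 * j] + img[r + 2 * i + 1][c + 2 * j + 1]) / 4.0
             for j in range(half)] for i in range(half)]


def encode(img, d=2, contrast=0.5):
    """Return one identifier (range_row, range_col, offset) per domain block."""
    n = len(img)
    identifiers = []
    for dr in range(0, n, d):
        for dc in range(0, n, d):
            dom = [row[dc:dc + d] for row in img[dr:dr + d]]
            dom_mean = sum(map(sum, dom)) / (d * d)
            best = None
            # Search every 2d-by-2d range block for the closest mapped match.
            for rr in range(0, n - 2 * d + 1, d):
                for rc in range(0, n - 2 * d + 1, d):
                    mapped = shrink(img, rr, rc, 2 * d)
                    m_mean = sum(map(sum, mapped)) / (d * d)
                    offset = dom_mean - contrast * m_mean
                    err = sum((contrast * mapped[i][j] + offset - dom[i][j]) ** 2
                              for i in range(d) for j in range(d))
                    if best is None or err < best[0]:
                        best = (err, rr, rc, offset)
            identifiers.append(best[1:])
    return identifiers


def decode(identifiers, n, d=2, contrast=0.5, iterations=12):
    """Iterate the stored mappings, starting from an arbitrary (all-zero) image."""
    img = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        new = [[0.0] * n for _ in range(n)]
        ids = iter(identifiers)
        for dr in range(0, n, d):
            for dc in range(0, n, d):
                rr, rc, offset = next(ids)
                mapped = shrink(img, rr, rc, 2 * d)
                for i in range(d):
                    for j in range(d):
                        new[dr + i][dc + j] = contrast * mapped[i][j] + offset
        img = new
    return img
```

Because the contrast is below one, each decoding pass halves the distance to the fixed point, so the starting image does not matter. The sketch assumes a square image whose side is a multiple of twice the domain block size, and it omits the affine variants (rotations, reflections) mentioned above.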

"Image Compression Using Fractals and Wavelets", Final Report for the Phase II 
Contract Sponsored by the Office of Naval Research, Contract No. N00014-91-C-0117, 
Netrologic Inc., San Diego, California (June 2, 1993), relates to various methods of compressing 
image data, including fractals and wavelets. This method may also be applicable in pattern 
recognition applications. This reference provides theory and comparative analysis of 
compression schemes. 

A fractal-based image extraction method is described in Kim, D.H.; 
Caulfield, H.J.; Jannson, T.; Kostrzewski, A.; Savant, G., "Optical fractal image processor for 
noise-embedded targets detection", Proceedings of the SPIE - The International Society for 
Optical Engineering, Vol. 2026, p. 144-9 (1993) (SPIE Conf: Photonics for Processors, Neural 
Networks, and Memories 12-15 July 1993, San Diego, CA, USA). This paper describes 
automatic target recognition (ATR) based on fractal dimensionality measurement and analysis. 
The ATR is a multi-step procedure, based on fractal image processing, and can 
simultaneously perform preprocessing, interest locating, segmenting, feature extracting, and 
classifying. See also, Cheong, C.K.; Aizawa, K.; Saito, T.; Hatori, M., "Adaptive edge detection 
with fractal dimension", Transactions of the Institute of Electronics, Information and 
Communication Engineers D-II, J76D-II(11):2459-63 (1993); Hayes, H.I.; Solka, J.L.; Priebe, 
C.E.; "Parallel computation of fractal dimension", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1962:219-30 (1993); Priebe, C.E.; Solka, J.L.; Rogers, G.W., 
"Discriminant analysis in aerial images using fractal based features", Proceedings of the SPIE - 
The International Society for Optical Engineering, 1962:196-208 (1993). See also, Anson, L., 
"Fractal Image Compression", Byte, October 1993, pp. 195-202; "Fractal Compression Goes 
On-Line", Byte, September 1993. 

Methods employing other than fractal-based algorithms may also be used. See, e.g., Liu, 
Y., "Pattern recognition using Hilbert space", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1825:63-77 (1992), which describes a learning approach, the 
Hilbert learning. This approach is similar to Fractal learning, but the Fractal part is replaced by 
Hilbert space. Like the Fractal learning, the first stage is to encode an image to a small vector in 
the internal space of a learning system. The next stage is to quantize the internal parameter 
space. The internal space of a Hilbert learning system is defined as follows: a pattern can be 
interpreted as a representation of a vector in a Hilbert space. Any vector in a Hilbert space can 
be expanded. If a vector happens to be in a subspace of a Hilbert space where the dimension L of 
the subspace is low (order of 10), the vector can be specified by its norm, an L-vector, and the 
Hermitian operator which spans the Hilbert space, establishing a mapping from an image space 
to the internal space P. This mapping converts an input image to a 4-tuple: t in P = (Norm, T, N, 
L-vector), where T is an operator parameter space and N is a set of integers which specifies the 
boundary condition. The encoding is implemented by mapping an input pattern into a point in its 
internal space. The system uses a local search algorithm, i.e., the system adjusts its internal data 
locally. The search is first conducted for an operator in a parameter space of operators, then an 
error function delta(t) is computed. The algorithm stops at a local minimum of delta(t). Finally, 
the input training set divides the internal space by a quantization procedure. See also, Liu, Y., 
"Extensions of fractal theory", Proceedings of the SPIE - The International Society for Optical 
Engineering, 1966:255-68 (1993). 

Fractal methods may be used for pattern recognition. See, Sadjadi, F., "Experiments in 
the use of fractal in computer pattern recognition", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1960:214-22(1993). According to this reference, man-made 
objects in infrared and millimeter wave (MMW) radar imagery may be recognized using fractal- 
based methods. The technique is based on estimation of the fractal dimensions of sequential 
blocks of an image of a scene and slicing of the histogram of the fractal dimensions computed by 
Fourier regression. The technique is shown to be effective for the detection of tactical military 
vehicles in IR, and of airport attributes in MMW radar imagery. 
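
A per-block fractal dimension estimate of the kind referred to above can be sketched as follows. The cited reference computes the dimension by Fourier regression; this sketch deliberately substitutes the simpler box-counting estimate on a binary mask, and the function names (box_count, box_dimension, block_dimensions), the box sizes and the tile size are hypothetical choices made for illustration.

```python
import math


def box_count(mask, box):
    """Count box-by-box tiles of a square mask that contain at least one set pixel."""
    n = len(mask)
    return sum(
        1
        for r in range(0, n, box)
        for c in range(0, n, box)
        if any(mask[i][j]
               for i in range(r, min(r + box, n))
               for j in range(c, min(c + box, n)))
    )


def box_dimension(mask, boxes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s) over the given box sizes."""
    counts = [box_count(mask, b) for b in boxes]
    if counts[0] == 0:
        return 0.0  # empty tile: nothing to measure
    xs = [math.log(1.0 / b) for b in boxes]
    ys = [math.log(c) for c in counts]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def block_dimensions(mask, block=16):
    """Box dimension of each block-by-block tile of a square mask, in raster order."""
    n = len(mask)
    dims = []
    for r in range(0, n, block):
        for c in range(0, n, block):
            tile = [row[c:c + block] for row in mask[r:r + block]]
            dims.append(box_dimension(tile))
    return dims
```

A filled tile yields a dimension of 2 and a straight line yields 1. Thresholding the list returned by block_dimensions would be a crude analogue of slicing the histogram of block dimensions.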

In addition to spatial self-similarity, temporal self-similarity may also be analyzed using 
fractal methods. See, Reusens, E., "Sequence coding based on the fractal theory of iterated 
transformations systems", Proceedings of the SPIE - The International Society for Optical 
Engineering, 2094(pt.1):132-40 (1993). This reference describes a scheme based on the iterated 
functions systems theory which relies on a 3D approach in which the sequence is adaptively 
partitioned. Each partition block can be coded either by using the spatial self similarities or by 
exploiting temporal redundancies. 

Fractal compression methods may be used for video data for transmission. See, Hurtgen, 
B.; Buttgen, P., "Fractal approach to low rate video coding", Proceedings of the SPIE - The 
International Society for Optical Engineering, 2094(pt.1):120-31 (1993). This reference relates to 
a method for fast encoding and decoding of image sequences on the basis of fractal coding theory 
and the hybrid coding concept. The DPCM-loop accounts for statistical dependencies of natural 
image sequences in the temporal direction. Those regions of the original image where the 
prediction, i.e. motion estimation and compensation, fails are encoded using an advanced fractal 
coding scheme, suitable for still images, and whose introduction instead of the commonly used 
Discrete Cosine Transform (DCT) -based coding is advantageous especially at very low bit rates 
(8-64 kbit/s). In order to increase reconstruction quality, encoding speed and compression ratio, 
some additional features such as hierarchical codebook search and multilevel block segmentation 
may be employed. This hybrid technique may be used in conjunction with the present adaptive 
interface or other features of the present invention. 

Fractal methods may be used to segment an image into objects having various surface 
textures. See, Zhi-Yan Xie; Brady, M., "Fractal dimension image for texture segmentation", 
ICARCV '92. Second International Conference on Automation, Robotics and Computer Vision, 
p. CV-4.3/1-5 vol.1, (1992). According to this reference, the fractal dimension and its change 
over boundaries of different homogeneous textured regions are analyzed and used to segment 
textures in infrared aerial images. Based on the fractal dimension, different textures map into 
different fractal dimension image features, such that there is smooth variation within a single 
homogeneous texture but sharp variation at texture boundaries. Since the fractal dimension 
remains unchanged under linear transformation, this method is robust against effects 
caused by lighting and other extrinsic factors. Morphology is the only tool used in the 
implementation of the whole process: texture feature extraction, texture segmentation and 
boundary detection. This makes possible parallel implementations of each stage of the process. 

Rahmati, M.; Hassebrook, L.G., "Intensity- and distortion-invariant pattern recognition 
with complex linear morphology", Pattern Recognition, 27(4):549-68 (1994), relates to a unified 
model-based pattern recognition approach which can be formulated into a variety of 
techniques to be used for a variety of applications. In this approach, complex phasor addition and 
cancellation are incorporated into the design of filter(s) to perform implicit logical operations 
using linear correlation operators. These implicit logical operations are suitable to implement 
high level gray scale morphological transformations of input images. In this way non-linear 
decision boundaries are effectively projected into the input signal space yet the mathematical 
simplicity of linear filter designs is maintained. This approach is applied to the automatic 
distortion- and intensity-invariant object recognition problem. A set of shape operators or 
complex filters is introduced which are logically structured into a filter bank architecture to 
accomplish the distortion and intensity-invariant system. This synthesized complex filter bank is 
optimally sensitive to fractal noise representing natural scenery. The sensitivity is optimized for 
a specific fractal parameter range using the Fisher discriminant. The output responses of the 
proposed system are shown for target, clutter, and pseudo-target inputs to represent its 
discrimination and generalization capability in the presence of distortion and intensity variations. 
Its performance is demonstrated with realistic scenery as well as synthesized inputs. 

Sprinzak, J.; Werman, M., "Affine point matching", Pattern Recognition Letters, 
15(4):337-9 (1994), relates to a pattern recognition method. A fundamental problem of pattern 
recognition, in general, is recognizing and locating objects within a given scene. The image of 
an object may have been distorted by different geometric transformations such as translation, 
rotation, scaling, general affine transformation or perspective projection. The recognition task 
involves finding a transformation that superimposes the model on its instance in the image. This 
reference proposes an improved method of superimposing the model. 

TEMPORAL IMAGE ANALYSIS 

Temporal image analysis is a well known field. This field holds substantial interest at 
present for two reasons. First, by temporal analysis of a series of two dimensional images, 
objects and object planes may be defined, which provide a basis for efficient yet general 
algorithms for video compression, such as the Moving Picture Experts Group (MPEG) series of 
standards. Second, temporal analysis has applications in signal analysis for an understanding and 
analysis of the signal itself. 

U.S. Patent No. 5,280,530, incorporated herein by reference, relates to a method and 
apparatus for tracking a moving object in a scene, for example the face of a person in videophone 
applications. The method comprises forming an initial template of the face, extracting a mask 
outlining the face, dividing the template into a plurality (for example sixteen) of sub-templates, 
searching the next frame to find a match with the template, searching the next frame to find a 
match with each of the sub-templates, determining the displacements of each of the sub-templates 
with respect to the template, using the displacements to determine affine transform coefficients 
and performing an affine transform to produce an updated template and updated mask. 
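
The step of turning sub-template displacements into affine transform coefficients can be illustrated with an ordinary least-squares fit. The sketch below is an assumption about how such a fit might be done, not the procedure of U.S. Patent No. 5,280,530; the function names (solve3, fit_affine) and the model x' = a*x + b*y + tx, y' = c*x + d*y + ty are illustrative.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m * p = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("sub-template centres are collinear or too few")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = v[i]  # replace column k with the right-hand side
        solution.append(det(mk) / d)
    return solution


def fit_affine(centres, displacements):
    """Least-squares affine coefficients ((a, b, tx), (c, d, ty)).

    centres       -- (x, y) position of each sub-template in the current frame
    displacements -- (dx, dy) found for each sub-template in the next frame
    """
    moved = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(centres, displacements)]
    rows = [(x, y, 1.0) for x, y in centres]
    # Normal equations (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * m[0] for r, m in zip(rows, moved)) for i in range(3)]
    aty = [sum(r[i] * m[1] for r, m in zip(rows, moved)) for i in range(3)]
    return tuple(solve3(ata, atx)), tuple(solve3(ata, aty))
```

With sixteen sub-templates the system is overdetermined, so a single badly matched sub-template is averaged against the others rather than dictating the result. A more robust estimator would be a reasonable refinement.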

U.S. Patent No. 5,214,504 relates to a moving video image estimation system. Based on 
an original video image at time n and time n+1, the centroid, the principal axis of inertia, the 
moment about the principal axis of inertia and the moment about the axis perpendicular to the 
principal axis of inertia are obtained. By using this information, an affine transformation for 
transforming the original video image at time n to the original video image at time n+1 is 
obtained. Based on the infinitesimal transformation (A), the transformations e^{At} and e^{A(t-1)}, 
obtained by making the affine transformation continuous with regard to time, are executed on the 
original video images at time n and time n+1, respectively. The results are synthesized to perform 
an interpolation between the frames. e^{A(t-1)} is applied to the original video image at time n+1. 
The video image after time n+1 is thereby predicted. 
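
The feature extraction step named above (centroid, principal axis and the two second moments) can be sketched with standard image moments. This is an illustration only and is not taken from U.S. Patent No. 5,214,504; the function name inertia_features and the returned keys are hypothetical, and the sketch does not attempt the time-continuous interpolation itself.

```python
import math


def inertia_features(img):
    """Centroid, principal-axis angle and second moments of a grey-level image.

    img is a list of rows; pixel values act as mass. The angle is in radians,
    measured from the x axis (columns) towards the y axis (rows).
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("image has no mass")
    cx, cy = m10 / m00, m01 / m00

    # Second-order central moments.
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            mu20 += (x - cx) ** 2 * v
            mu02 += (y - cy) ** 2 * v
            mu11 += (x - cx) * (y - cy) * v

    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    # Eigenvalues of [[mu20, mu11], [mu11, mu02]]: spread measured along the
    # principal axis and along the axis perpendicular to it.
    mean = (mu20 + mu02) / 2.0
    spread = math.hypot((mu20 - mu02) / 2.0, mu11)
    return {
        "centroid": (cx, cy),
        "angle": angle,
        "moment_along": mean + spread,
        "moment_across": mean - spread,
    }
```

Comparing these features between the frames at time n and time n+1 would suggest a translation (centroid shift), a rotation (angle difference) and axis scalings (ratios of the second moments). How the patent combines them into its affine transformation is not specified here.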

U.S. Patent No. 5,063,603, incorporated herein by reference, relates to a dynamic method 
for recognizing objects and image processing system therefor. This reference discloses a method 
of distinguishing between different members of a class of images, such as human beings. A time 
series of successive relatively high-resolution frames of image data, any frame of which may or 
may not include a graphical representation of one or more predetermined specific members (e.g., 
particular known persons) of a given generic class (e.g. human beings), is examined in order to 
recognize the identity of a specific member, if that member's image is included in the time series. 
The frames of image data may be examined in real time at various resolutions, starting with a 
relatively low resolution, to detect whether some earlier-occurring frame includes any of a group 
of image features possessed by an image of a member of the given class. The image location of a 
detected image feature is stored and then used in a later-occurring, higher resolution frame to 
direct the examination only to the image region of the stored location in order to (1) verify the 
detection of the aforesaid image feature, and (2) detect one or more other of the group of image 
features, if any is present in that image region of the frame being examined. By repeating this 
type of examination for later and later occurring frames, the accumulated detected features can 
first reliably recognize the detected image region to be an image of a generic object of the given 
class, and later can reliably recognize the detected image region to be an image of a certain 
specific member of the given class. Thus, a human identity recognition feature of the present 
invention may be implemented in this manner. Further, it is clear that this recognition feature 
may form an integral part of certain embodiments of the present invention. It is also clear that 
the various features of the present invention would be applicable as an adjunct to the various 
elements of the system disclosed in U.S. Patent 5,063,603. 

U.S. Patent No. 5,067,160, incorporated herein by reference, relates to a motion-pattern 
recognition apparatus, having adaptive capabilities. The apparatus recognizes a motion of an 
object that is moving and is hidden in an image signal, and discriminates the object from the 
background within the signal. The apparatus has an image-forming unit comprising non-linear 
oscillators, which forms an image of the motion of the object in accordance with an 
adjacent-mutual-interference-rule, on the basis of the image signal. A memory unit, comprising 
non-linear oscillators, stores conceptualized meanings of several motions. A retrieval unit 
retrieves a conceptualized meaning close to the motion image of the object. An altering unit 
alters the rule, on the basis of the conceptualized meaning. The image forming unit, memory 
unit, retrieval unit and altering unit form a holonic-loop. Successive alterations of the rules by 
the altering unit within the holonic loop change an ambiguous image formed in the image 
forming unit into a distinct image. U.S. Patent 5,067,160 cites the following references, which 
are relevant to the task of discriminating a moving object in a background: 

U.S. Patent No. 4,710,964. 

Shimizu et al, "Principle of Holonic Computer and Holovision", Journal of the Institute of 
Electronics, Information and Communication, 70(9):921-930 (1987). 

Omata et al, "Holonic Model of Motion Perception", IEICE Technical Reports, 3/26/88, 
pp. 339-346. 

Ohsuga et al, "Entrainment of Two Coupled van der Pol Oscillators by an External 
Oscillation", Biological Cybernetics, 51:225-239 (1985). 

U.S. Patent No. 5,065,440, incorporated herein by reference, relates to a pattern 
recognition apparatus, which compensates for, and is thus insensitive to pattern shifting, thus 
being useful for decomposing an image or sequence of images, into various structural features 
and recognizing the features. U.S. Patent 5,065,440 cites the following references, incorporated 
herein by reference, which are also relevant to the present invention: U.S. Patent Nos. 4,543,660, 
4,630,308, 4,677,680, 4,809,341, 4,864,629, 4,872,024 and 4,905,296. 
25 Recent analyses of fractal image compression techniques have tended to imply that, other 

than in special circumstances, other image compression methods are "better" than a Barnsley- 
type image compression system, due to the poor performance of compression* processors and 
lower than expected compression ratios. Further, statements attributed to Barnsley have 
indicated that the Barnsley technique is not truly a "fractal" technique, but rather a vector 
30 quantization process which employs a recursive library. Nevertheless, these techniques and 
analyses have their advantages. As stated hereinbelow, the fact that the codes representing the 
compressed image are hierarchical represents a particular facet exploited by the present 
invention. 
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Another factor which makes fractal methods and analysis relevant to the present 
invention is the theoretical relation to optical image processing and holography. Thus, while 
such optical systems may presently be cumbersome and economically unfeasible, and their 
implementation in software models slow, these techniques nevertheless hold promise and present 
distinct advantages. 

BIOMETRIC ANALYSIS 

Biometric analysis comprises the study of the differences between various organisms, 
typically of the same species. Thus, the intraspecies variations become the basis for 
differentiation and identification. In practice, there are many applications for biometric analysis
systems; for example, in security applications, these allow identification of a particular human.

U.S. Patent No. 5,055,658, incorporated herein by reference, relates to a security system 
employing digitized personal characteristics, such as voice. The following references are cited: 

"Voice Recognition and Speech Processing", Elektor Electronics, Sep. 1985, pp. 56-57. 

Naik et al., "High Performance Speaker Verification .", ICASSP 86, Tokyo, 
CH2243-4/86/0000-0881, IEEE 1986, pp. 881-884. 

Shinan et al., "The Effects of Voice Disguise .", ICASSP 86, Tokyo, 
CH2243-4/86/0000-0885, IEEE 1986, pp. 885-888. 

Parts of this system relating to speaker recognition may be used to implement a voice 
recognition system of the present invention for determining an actor or performer in a broadcast. 

NEURAL NETWORKS 

Neural networks are a particular type of data analysis tool. They are characterized by the
fact that the network is represented by a set of "weights", which are typically scalar values,
which are derived by a formula which is designed to reduce the error between a data pattern
representing a known state and the network's prediction of that state. These networks, when 
provided with sufficient complexity and an appropriate training set, may be quite sensitive and 
precise. Further, the data pattern may be arbitrarily complex (although the computing power 
required to evaluate the output will also grow) and therefore these systems may be employed for 
video and other complex pattern analysis. 

U.S. Patent No. 5,067,164, incorporated herein by reference, relates to a hierarchical
constrained automatic learning neural network for character recognition, and thus represents an
example of a trainable neural network for pattern recognition, which discloses methods which are
useful for the present invention. This Patent cites various references of interest:

U.S. Patent Nos. 4,760,604, 4,774,677 and 4,897,811.

LeCun, Y., Connectionism in Perspective, R. Pfeifer, Z. Schreter, F. Fogelman, L. Steels, 
(Eds.), 1989, "Generalization and Network Design Strategies", pp. 143-55. 

LeCun, Y., et al., "Handwritten Digit Recognition: Applications of Neural.", IEEE 
Comm. Magazine, pp. 41-46 (Nov. 1989). 

Lippmann, R. P., "An Introduction to Computing with Neural Nets", IEEE ASSP
Magazine, 4(2):4-22 (Apr. 1987).

Rumelhart, D. E., et al., Parallel Distr. Proc.: Explorations in Microstructure of
Cognition, vol. 1, 1986, "Learning Internal Representations by Error Propagation", pp. 318-362. 

U.S. Patents 5,048,100, 5,063,601 and 5,060,278, all incorporated herein by reference,
also relate to neural network adaptive pattern recognition methods and apparatuses. It is clear
that the methods of 5,048,100, 5,060,278 and 5,063,601 may be used to perform the adaptive
pattern recognition functions of the present invention. More general neural networks are
disclosed in U.S. Patents 5,040,134 and 5,058,184, both incorporated herein by reference, which
provide background on the use of neural networks. In particular, 5,058,184 relates to the use of
the apparatus in information processing and feature detection applications.

U.S. Patent No. 5,058,180, incorporated herein by reference, relates to neural network
apparatus and method for pattern recognition, and is thus relevant to the intelligent pattern
recognition functions of the present invention. This patent cites the following documents of
interest:

U.S. Patent Nos. 4,876,731 and 4,914,708. 

Carpenter, G. A., S. Grossberg, "The Art of Adaptive Pattern Recognition by a 
Self-Organizing Neural Network," IEEE Computer, Mar. 1988, pp. 77-88. 

Computer Visions, Graphics, and Image Processing 1987, 37:54-115.

Grossberg, S., G. Carpenter, "A Massively Parallel Architecture for a Self-Organizing
Neural Pattern Recognition Machine," Computer Vision, Graphics, and Image Processing (1987,
37, 54-115), pp. 252-315.

Gullichsen E., E. Chang, "Pattern Classification by Neural Network: An Experiment 
System for Icon Recognition," ICNN Proceeding on Neural Networks, Mar. 1987, pp.
IV-725-32.

Jackel, L. D., H. P. Graf, J. S. Denker, D. Henderson and I. Guyon, "An Application of
Neural Net Chips: Handwritten Digit Recognition," ICNN Proceeding, 1988, pp. II-107-15.

Lippman, R. P., "An Introduction to Computing with Neural Nets," IEEE ASSP
Magazine, Apr. 1987, pp. 4-22. 

Pawlicki, T. R, D. S. Lee, J. J. Hull and S. N. Srihari, "Neural Network Models and their 
Application to Handwritten Digit Recognition," ICNN Proceeding, 1988, pp. II-63-70.

Chao, T.-H.; Hegblom, E.; Lau, B.; Stoner, W.W.; Miceli, W.J., "Optoelectronically 
implemented neural network with a wavelet preprocessor", Proceedings of the SPIE - The
International Society for Optical Engineering, 2026:472-82(1993), relates to an optoelectronic 
neural network based upon the Neocognitron paradigm which has been implemented and 
successfully demonstrated for automatic target recognition for both focal plane array imageries 
and range-Doppler radar signatures. A particular feature of this neural network architectural 
design is the use of a shift-invariant multichannel Fourier optical correlation as a building block 
for iterative multilayer processing. A bipolar neural weights holographic synthesis technique 
was utilized to implement both the excitatory and inhibitory neural functions and increase its 
discrimination capability. In order to further increase the optoelectronic Neocognitron's 
self-organization processing ability, a wavelet preprocessor was employed for feature extraction 
preprocessing (orientation, size, location, etc.). A multichannel optoelectronic wavelet processor 
using an e-beam complex-valued wavelet filter is also described. 

Neural networks are important tools for extracting patterns from complex input sets. 
These systems do not require human comprehension of the pattern in order to be useful, although 
human understanding of the nature of the problem is helpful in designing the neural network 
system, as is known in the art. Feedback to the neural network is integral to the training process. 
Thus, a set of inputs is mapped to a desired output range, with the network minimizing an
"error" for the training data set. Neural networks may differ based on the computation of the
"error", the optimization process, the method of altering the network to minimize the error, and
the internal topology. Such factors are known in the art. 
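
By way of illustration only, the error-minimizing training described above may be sketched in software. The following is a minimal sketch under stated assumptions, and is not taken from any of the cited patents: it assumes a single sigmoid unit, a squared-error measure, and plain gradient descent as the method of altering the weights. The names predict, total_error, train and AND_SET are hypothetical.

```python
import math
import random


def predict(weights, inputs):
    """Output of one sigmoid unit; the last weight is the bias term."""
    s = weights[-1] + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))


def total_error(weights, samples):
    """Sum of squared errors between predictions and the known states."""
    return sum(0.5 * (predict(weights, x) - t) ** 2 for x, t in samples)


def train(samples, epochs=2000, rate=0.5, seed=0):
    """Adjust the weights by gradient descent to reduce the squared error."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for inputs, target in samples:
            output = predict(weights, inputs)
            # Derivative of 0.5 * (output - target)**2 through the sigmoid.
            delta = (output - target) * output * (1.0 - output)
            for i, x in enumerate(inputs):
                weights[i] -= rate * delta * x
            weights[-1] -= rate * delta
    return weights


# Assumed toy training set: the logical AND of two binary inputs.
AND_SET = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights = train(AND_SET)
```

In this sketch the "error" computation, the optimization process and the topology are each fixed to the simplest choice; a practical network would vary all three, as noted above.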

OPTICAL PATTERN RECOGNITION 

Optical image processing holds a number of advantages. First, images are typically 
optical by their nature, and therefore processing by this means may (but not always) avoid a data 
conversion. Second, many optical image processing schemes are inherently or easily performed 
in parallel, improving throughput. Third, optical circuits typically have response times shorter 
than electronic circuits, allowing potentially short cycle times. While many optical phenomena
may be modeled using electronic computers, appropriate applications for optical computing, such
as pattern recognition, hold promise for high speed in systems of acceptable complexity.

U.S. Patent No. 5,060,282, incorporated herein by reference, relates to an optical pattern
recognition architecture implementing the mean-square error correlation algorithm. This method
allows an optical computing function to perform pattern recognition functions. U.S. Patent No.
5,060,282 cites the following references, which are relevant to optical pattern recognition:

Kellman, P., "Time Integrating Optical Signal Processing". Ph. D. Dissertation. Stanford 
University, 1979, pp. 51-55. 

Molley, P., "Implementing the Difference-Squared Error Algorithm Using An
Acousto-Optic Processor", SPIE, 1098:232-239, (1989).

Molley, P., et al., "A High Dynamic Range Acousto-Optic Image Correlator for 
Real-Time Pattern Recognition", SPIE, 938:55-65 (1988). 

Psaltis, D., "Incoherent Electro-Optic Image Correlator", Optical Engineering, 
23(1):12-15 (Jan./Feb. 1984).

Psaltis, D., "Two-Dimensional Optical Processing Using One-Dimensional Input 
Devices", Proceedings of the IEEE, 72(7):962-974 (Jul. 1984). 

Rhodes, W., "Acousto-Optic Signal Processing: Convolution and Correlation", Proc. of 
the IEEE, 69(1):65-79 (Jan. 1981).

Vander Lugt, A., "Signal Detection By Complex Spatial Filtering", IEEE Transactions
On Information Theory, IT-10, 2:139-145 (Apr. 1964).

U.S. Patent Nos. 5,159,474 and 5,063,602, expressly incorporated herein by reference,
also relate to optical image correlators. Also of interest is Li, H.Y., Y. Qiao and D. Psaltis,
Applied Optics (April, 1993). See also, Bains, S., "Trained Neural Network Recognizes Faces",
Laser Focus World, June, 1993, pp. 26-28; Bagley, H. & Sloan, J., "Optical Processing: Ready
For Machine Vision?", Photonics Spectra, August 1993, pp. 101-106.

Optical pattern recognition has been especially applied to two dimensional patterns. In an
optical pattern recognition system, an image is correlated with a set of known image patterns
represented on a hologram, and the product is a pattern according to a correlation between the
input pattern and the provided known patterns. Because this is an optical technique, it is
performed nearly instantaneously, and the output information can be reentered into an electronic
digital computer through optical transducers known in the art. Such a system is described in
Casasent, D., Photonics Spectra, November 1991, pp. 134-140. The references cited therein
provide further details of the theory and practice of such a system: Lendaris, G.G., and Stanley,
G.L., "Diffraction Pattern Sampling for Automatic Target Recognition", Proc. IEEE 58:198-205
(1979); Ballard, D.H., and Brown, C.M., Computer Vision, Prentice Hall, Englewood Cliffs, N.J.
(1982); Optical Engineering 28:5 (May 1988)(Special Issue on product inspection); Richards J.,
and Casasent, D., "Real Time Hough Transform for Industrial Inspection" Proc. SPIE Technical
Symposium, Boston 1989 1192:2-21 (1989); Maragos, P., "Tutorial Advances in Morphological
Image Processing" Optical Engineering 26:7:623-632 (1987); Casasent, D., and Tescher, A.,
Eds., "Hybrid Image and Signal Processing II", Proc. SPIE Technical Symposium, April 1990,
Orlando FL. 1297 (1990); Ravichandran, G. and Casasent, D., "Noise and Discrimination
Performance of the MINACE Optical Correlation Filter", Proc. SPIE Technical Symposium,
April 1990, Orlando FL., 1471 (1990); Wechsler, H. Ed., "Neural Nets For Human and Machine
Perception", Academic Press, New York (1991).

By employing volume holographic images, the same types of paradigms may be applied 
to three dimensional images. 
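
The correlation of an input pattern against a set of known patterns, performed optically in the systems described above, has a simple digital analogue. The following is a minimal sketch under stated assumptions: it substitutes a zero-mean normalized correlation computed in software for the holographic correlator, and the 3 x 3 patterns and the names normalized_correlation, correlate_with_library and LIBRARY are hypothetical illustrations.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-sized 2-D patterns."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    da = [v - mean_a for v in flat_a]
    db = [v - mean_b for v in flat_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A constant pattern has no structure to correlate; report zero.
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def correlate_with_library(image, library):
    """Correlate one input pattern with every known pattern in the library."""
    return {name: normalized_correlation(image, pattern)
            for name, pattern in library.items()}


# Assumed library of known patterns (stands in for the hologram).
LIBRARY = {
    "cross": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "diagonal": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}

# Input pattern: the cross with one element missing.
noisy_cross = [[0, 1, 0], [1, 1, 1], [0, 0, 0]]
scores = correlate_with_library(noisy_cross, LIBRARY)
best = max(scores, key=scores.get)
```

Here the degraded cross still correlates most strongly with the stored cross pattern. The optical system evaluates all library patterns in parallel, whereas this sketch evaluates them one at a time.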

QUERY BY IMAGE CONTENT

Query by image content, a phrase coined by IBM researchers, relates to a system for 
retrieving image data stored in a database on the basis of the colors, textures, morphology or 
objects contained within the image. Therefore, the system characterizes the stored images to 
generate a metadata index, which can then be searched. Unindexed searching is also possible. 
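
As an illustration of characterizing stored images to generate a searchable metadata index, the following minimal sketch indexes images by a coarse color histogram and ranks them by histogram intersection, a measure in the spirit of the "Color Indexing" reference cited below. It is an assumption-laden example and not the method of any named system: the bin count, the toy pixel data and the names color_histogram, build_index and query are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized coarse RGB histogram; pixels are (r, g, b) with 0-255 values."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]


def build_index(images):
    """Characterize each stored image once, producing the metadata index."""
    return {name: color_histogram(pixels) for name, pixels in images.items()}


def query(index, example_pixels, top=1):
    """Rank stored images by histogram intersection with an example image."""
    probe = color_histogram(example_pixels)

    def similarity(hist):
        return sum(min(p, h) for p, h in zip(probe, hist))

    ranked = sorted(index, key=lambda name: similarity(index[name]), reverse=True)
    return ranked[:top]


# Assumed toy database: each image is a short list of pixels.
images = {
    "sunset": [(250, 80, 20)] * 3 + [(30, 20, 60)],
    "forest": [(20, 200, 30)] * 4,
    "ocean": [(10, 40, 220)] * 4,
}
index = build_index(images)
result = query(index, [(240, 90, 10)] * 4)
```

The search touches only the index, not the stored pixels, which is the practical benefit of indexing; an unindexed search would instead compute each histogram at query time.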

A number of query by image content systems are known, including both still and moving
image systems, for example from IBM (QBIC), Apple (Photobook), Belmont Research Inc.
(Steve Gallant), BrainTech Inc.; Center for Intelligent Information Retrieval (UMass Amherst),
Virage, Inc., Informix Software, Inc. (Illustra), Islip Media, Inc., Magnifi, Numinous
Technologies, Columbia University VisualSeek/WebSeek (Chang et al., John R. Smith), Monet
(CWI and UvA), Visual Computing Laboratory, UC San Diego (ImageGREP, White and Jain).
See also, ISO/IEC MPEG-7 literature.

See, Jacobs, et al., "Fast Multiresolution Image Querying", Department of Computer 
Science, University of Washington, Seattle Washington. 

U.S. Patent No. 5,655,117, expressly incorporated herein by reference, relates to a
method and apparatus for indexing multimedia information streams for content-based retrieval. 
See also: 

Gong et al, "An Image Database System with Content Capturing and Fast Image Indexing
Abilities", PROC of the International Conference on Multimedia Computing and Systems pp
121-130 May 19, 1994.

Hongjiang, et al., Digital Libraries, "A Video Database System for Digital Libraries", pp 
253-264, May 1994. 

S. Abe and Y. Tonomura, Systems and Computers in Japan, vol. 24, No. 7, "Scene
Retrieval Method Using Temporal Condition Changes", pp. 92-101, 1993.

Salomon et al, "Using Guides to Explore Multimedia Databases", PROC of the
Twenty-Second Annual Hawaii International Conference on System Sciences, vol. IV, 3-6 Jan. 1989 pp
3-12 vol. 4. Jan. 6, 1989.

Stevens, "Next Generation Network and Operating System Requirements for Continuous 
Time Media", in Herrtwich (Ed.), Network and Operating System Support for Digital Audio and
Video, pp. 197-208, Nov. 1991. 

U.S. Patent No. 5,606,655, expressly incorporated herein by reference, relates to a
method for representing contents of a single video shot using frames. The method provides a
representative frame (Rframe) for a group of frames in a video sequence, selecting a reference
frame from the group of frames and storing the reference frame in a computer memory. This
system defines a peripheral motion tracking region along an edge of the reference frame and
successively tracks movement of boundary pixels in the tracking region, symbolizing any of the
length of the shot and the presence of any caption. See, also:

"A Magnifier Tool for Video Data", Mills et al., Proceedings of ACM Computer Human
Interface (CHI), May 3-7, 1992, pp. 93-98.

"A New Family of Algorithms for Manipulating Compressed Images", Smith et al., IEEE 
Computer Graphics and Applications, 1993. 

"Anatomy of a Color Histogram", Novak et al., Proceeding of Computer Vision and 
Pattern Recognition, Champaign, IL, Jun. 1992, pp. 599-605.

"Automatic Structure Visualization for Video Editing", Ueda et al., InterCHI'93
Conference Proceedings, Amsterdam, The Netherlands, 24-29 Apr. 1993, pp. 137-141.

"Automatic Video Indexing and Full-Video Search for Object Appearances", Nagasaka et
al., Proceedings of the IFIP TC2/WG2.6 Second Working Conference on Visual Database
Systems, North Holland, Sep. 30-Oct. 3, 1991, pp. 113-127.

"Color Indexing", Swain et al., International Journal of Computer Vision, vol. 7, No. 1, 
1991, pp. 11-32. 

"Content Oriented Visual Interface Using Video Icons for Visual Database Systems", 
Tonomura et al., Journal of Visual Languages and Computing (1990) 1, pp. 183-198.

"Developing Power Tools for Video Indexing and Retrieval", Zhang et al., Proceedings
of SPIE Conference on Storage and Retrieval for Image and Video Databases, San Jose CA 
1994. 

"Image Information Systems: Where Do We Go From Here?", Chang et al., IEEE 
Transactions on Knowledge and Data Engineering, vol. 4, No. 5, Oct. 1992, pp. 431-442.

"Image Processing on Compressed Data for Large Video Databases", Arman et al., 
Proceedings of First ACM International Conference on Multimedia, Anaheim, CA, 1-6 Aug 
1993, pp. 267-272. 

"Image Processing on Encoded Video Sequences", Arman et al., ACM Multimedia 
Systems Journal, to appear 1994.

"Impact: An Interactive Natural-Motion-Picture Dedicated Multimedia Authoring 
System", Ueda et al., Proceedings of Human Factors in Computing Systems (CHI 91), New 
Orleans, LA, Apr. 27-May 2, 1991, pp. 343-350. 

"MPEG: A Video Compression Standard for Multimedia Applications", Le Gall, 
Communications of the ACM, vol. 34, No. 4, Apr. 1991, pp. 47-58.

"News On-Demand for Multimedia Networks", Miller et al., ACM International 
Conference on Multimedia, Anaheim, CA, 1-6, Aug. 1993, pp. 383-392. 

"Overview of the px64 kbit/s Video Coding Standard", Liou, Communications of the 
ACM, vol. 34, No. 4, Apr. 1991, pp. 60-63. 

"Pattern Recognition by Moment Invariants", Hu et al., Proc. IRE, vol. 49, 1961, p. 1428.

"Pattern Recognition Experiments in the Mandala/Cosine Domain", Hsu et al., IEEE 
Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 5, Sept. 1983, pp 
512-520. 

"The JPEG Still Picture Compression Standard", Wallace, Communications of the ACM, 
vol. 34, No. 4, Apr. 1991, pp. 31-44.

"The Revised Fundamental Theorem of Moment Invariants", Reiss, IEEE Transactions 
on Pattern Analysis and Machine Intelligence, vol. 13, No. 8, Aug. 1991, pp. 830-834.

"VideoMAP and VideoSpaceIcon: Tools for Anatomizing Video Content", Tonomura et
al., InterCHI'93 Conference Proceedings, Amsterdam, The Netherlands, 24-29 Apr., 1993, pp.
131-136.

"Visual Pattern Recognition by Moment Invariants", IRE Trans. Inform. Theory, vol. 8, 
Feb. 1962, pp. 179-187. 

"Watch - Grab - Arrange - See: Thinking with Motion Images via Streams and Collages", 
Elliott, Ph.D. Thesis, MIT, Feb. 1993. 

Book entitled Digital Image Processing, by Gonzalez et al., Addison-Wesley, Reading,
MA, 1977.

Book entitled Digital Picture Processing by Rosenfeld et al., Academic Press, Orlando, 
Florida, 1982. 

Book entitled Discrete Cosine Transform - Algorithms, Advantages, Applications, by 
Rao et al., Academic Press, Inc., 1990.

Book entitled Sequential Methods in Pattern Recognition and Machine Learning, Fu, 
Academic, NY, NY 1968. 

C.-C. J. Kuo (ed), "Multimedia Storage and Archiving Systems", SPIE Proc. Vol. 2916 
(11/18-11/22/96). 



Hoffberg et al. 



-32- 



LIH-10 



U.S. Patent No. 5,600,775, expressly incorporated herein by reference, relates to a 
method and apparatus for annotating full motion video and other indexed data structures. U.S. 
Patent No. 5,428,774, expressly incorporated herein by reference, relates to a system of updating
an index file of frame sequences so that it indexes non-overlapping motion image frame
sequences. U.S. Patent No. 5,550,965, expressly incorporated herein by reference, relates to a
method and system for operating a data processor to index primary data in real time with iconic
table of contents. U.S. Patent No. 5,083,860, expressly incorporated herein by reference, relates
to a method for detecting change points in motion picture images. U.S. Patent No. 5,179,449,
expressly incorporated herein by reference, relates to a scene boundary detecting apparatus. See
also:

"A show and tell of the QBIC technology -Query By Image Content (QBIC)", IBM 
QBIC Almaden web site, pp. 1-4. 

"Chaos & Non-Linear Models in Economics". 

"Chaos Theory in the Financial Markets. Applying Fractals, Fuzzy Logic, Genetic 
Algorithms".

"Evolutionary Economics & Chaos Theory".

"Four Eyes", MIT Media Lab web site; pp. 1-2.

"Frequently asked questions about visual information retrieval", Virage Incorporated web 
site; pp. 1-3. 

"IBM Ultimedia Manager 1.1 and Client Search", IBM software web site, pp. 1-4.

"Image Detection and Registration", Digital Image Processing, Pratt, Wiley, New York,
1991.

"Jacob Methodology" @ http://WWCSAI.diepa.unipa.it/research/projects/jacob/jacob-method.html.

"Market Analysis. Applying Chaos Theory to Investment & Economics".

"Photobook", MIT Media Lab web site; Aug. 7, 1996; pp. 1-2.

"Profiting from Chaos. Using Chaos Theory for Market Timing, Stock Selection &
Option".

"Shape Analysis", Digital Image Processing, Pratt, Wiley, New York, 1991.

"The QBIC Project", IBM QBIC Almaden web site, home page (pp. 1-2).

"Virage - Visual Information Retrieval", Virage Incorporated, home page.

"Virage Products", Virage Incorporated web site; pp. 1-2.

"Visual Information Retrieval: A Virage Perspective Revision 3", Virage Incorporated 
web site; 1995; pp. 1-13. 

"Workshop Report: NSF - ARPA Workshop on Visual Information Management
Systems", Virage Incorporated web site; pp. 1-15.

A. D. Bimbo, et al, "3-D Visual Query Language for Image Databases", Journal Of 
Visual Languages & Computing, 1992, pp. 257-271. 

A. E. Cawkell, "Current Activities in Image Processing Part III: Indexing Image 
Collections", CRITique, vol. 4, No. 8, May 1992, pp. 1-11, ASLIB, London.

A. Pizano et al, "Communicating with Pictorial Databases", Human-Machine Interactive
Systems, pp. 61-87, Computer Science Dept, UCLA, 1991. 

A. Yamamoto et al, "Extraction of Object Features from Image and its Application to 
Image Retrieval", IEEE 9th International Conference On Pattern Recognition, 1988, 988-991. 

A. Yamamoto et al, "Image Retrieval System Based on Object Features", IEEE
Publication No. CH2518-9/87/0000-0132, 1987, pp. 132-134.

A. Yamamoto et al., "Extraction of Object Features and Its Application to Image 
Retrieval", Trans. of IEICE, vol. E72, No. 6, 771-781 (Jun. 1989).

A. Yamamoto et al., "Extraction of Object Features from Image and Its Application to 
Image Retrieval", Proc. 9th Annual Conference on Pattern Recognition, vol. II, pp. 988-991
(Nov. 1988). 

A. Soffer and H. Samet. Retrieval by content in symbolic-image databases. In
Symposium on Electronic Imaging: Science and Technology - Storage & Retrieval for Image and
Video Databases IV, pages 144 - 155. IS&T/SPIE, 1996.

Abadi, M., et al, "Authentication and Delegation with Smart-cards", Oct. 22, 1990,
revised Jul. 30, 1992 Report 67, Systems Research Center, Digital Equipment Corp., Palo Alto,

DEMOGRAPHICALLY TARGETED ADVERTISING THROUGH ELECTRONIC MEDIA

Since the advent of commercially subsidized print media, attempts have been made to 
optimize the placement and compensation aspects relating to commercial messages or 
advertisements in media. In general, advertisers subsidize a large percentage of the cost of mass 
publications and communications, in return for the inclusion and possibly strategic placement of
advertisements in the publication. Therefore, the cost of advertising in such media includes the 
cost of preparation of the advertisement, a share of the cost of publication and a profit for the 
content provider and other services. Since the advertiser must bear some of the cost of 
production and distribution of the content, in addition to the cost of advertisement placement 
itself, the cost may be substantial. The advertiser justifies this cost because of the wide public
reception of the advertisement, typically at a low cost per consumer "impression", with a related
stimulation of sales due to commercial awareness of the advertiser's products and services.
Therefore, the advertisement is deemed particularly effective if either the audience is very large, 
with ad response proportionate to the size of the audience, or if it targets a particularly receptive 
audience, with a response rate higher than the general population. 

On the other hand, the recipient of the commercial publication is generally receptive of 
the advertisement, even though it incurs a potential inefficiency in terms of increased data 
content and inefficiencies in receiving the content segment, for two reasons. First, the 
advertisements subsidize the publication, lowering the monetary cost to the recipient. Second, it 
is considered economically efficient for a recipient to review commercial information relating to 
prospective purchases or expenditures, rather than directly soliciting such information from the 
commercial source, i.e., "push" is better than "pull". For this reason specialty publications are 
produced, including commercial messages appropriate for the particular content of the media or 
the intended recipients. In fact, in some forms of publications, most, if not all the information 
content is paid advertisements, with few editorial or independently produced pieces. 

Mass media, on the other hand, tends not to include specialty commercial messages, 
because the interested population is too dispersed and the resulting response rate from an
advertisement too low, and further because the majority of the audience will be uninterested or
even respond negatively to certain messages. Thus, mass media generally includes a majority of 
retail advertisements, with specialty advertisements relegated, if at all, to a classified section 
which is not interspersed with other content. 

This is the basis for a "least common denominator" theory of marketing, that mass media 
must merchandise to the masses, while specialty media merchandises to selected subpopulations. 
As a corollary, using such types of media, it may be difficult to reach certain specialized 
populations who do not consistently receive a common set of publications or who receive
primarily publications which are unspecialized or directed to a different specialty. 

Where a recipient has limited time for reviewing media, he or she must divide his or her
available time between mass media and specialty media. Alternatively, publication on demand
services have arisen which select content based on a user's expressed interests. Presumably,
these same content selection algorithms may be applied to commercial messages. However,
these services are primarily of limited distribution, and have content that is as variable as
commercial messages. Likewise, mass media often has regionally variable content, such as local
commercials on television or cable systems, or differing editions of print media for different
regions. Methods are known for demographic targeting of commercial information to
consumers; however, both the delivery methods and demographic targeting methods tend to be
suboptimal.

Sometimes, however, the system breaks down, resulting in inefficiencies. These result
where the audience or a substantial proportion thereof is inappropriate for the material presented,
and thus yield a low response rate for an advertiser or even a negative response for the media
due to the existence of particular commercial advertisers. The recipients are bombarded with
inappropriate information, while the advertiser fails to realize optimal return on its advertising
expenditures. In order to minimize the occurrence of these situations, services are available,
including A.C. Nielsen Co. and Arbitron, Inc., which seek to determine the demographics of the
audience of broadcast media.

U.S. 5,436,653, incorporated herein by reference, relates to a broadcast segment
recognition system in which a signature representing a monitored broadcast segment is compared
with broadcast segment signatures in a data base representing known broadcast segments to
determine whether a match exists. Therefore, the broadcast viewing habits of a user may be
efficiently and automatically monitored, without pre-encoding broadcasts or the like.

U.S. 5,459,306, incorporated herein by reference, relates to a method for delivering
targeting information to a prospective individual user. Personal user information is gathered, as
well as information on a user's use of a product, correlated and stored. Classes of information
potentially relevant to future purchases are then identified, and promotions and recommendations
delivered based on the information and the user information.

U.S. 5,483,278, incorporated herein by reference, relates to a system having a user
interface which can access downloaded electronic programs and associated information records, 
and which can automatically correlate the program information with the preferences of the user, 
to create and display a personalized information database based upon the results of the 
correlation. Likewise, U.S. 5,223,914, expressly incorporated herein by reference, relates to a
system and method for automatically correlating user preferences with a T.V. program 
information database. 

U.S. Patent No. 5,231,494, expressly incorporated herein by reference, relates to a system 
which selectively extracts one of a plurality of compressed television signals from a single 
channel based on viewer characteristics.

U.S. Patent No. 5,410,344 relates to a system for selecting video programs based on 
viewers' preferences, based on content codes of the programs.

U.S. 5,485,518, incorporated herein by reference, relates to a system for electronic media 
program recognition and choice, allowing, for example, parental control of the individual 
programs presented, without requiring a transmitted editorial code.

VIDEOCONFERENCING TECHNOLOGIES 

Videoconferencing systems are well known in the art. A number of international
standards have been defined, providing various telecommunication bandwidth and
communication link options. For example, H.320, H.323 and H.324 are known transport
protocols over ISDN, packet switched networks and public switched telephone networks,
respectively. H.324 provides a multimedia information communication and videoconferencing
standard for communication over the standard "plain old telephone system" network ("POTS"),
in which the video signal is compressed using DCT transforms and motion compensation for
transmission over a V.80 synchronous V.34-type modem link. The video image is provided as a
video window with relatively slow frame rate. This image, in turn, may be presented on a
computer monitor or television system, with appropriate signal conversion. See, Andrew W.
Davis, "Hi Grandma!: Is It Time for TV Set POTS Videoconferencing?", Advanced Imaging, pp.
45-49 (March 1997); Jeff Child, "H.324 Paves Road For Mainstream Video Telephony",
Computer Design, January 1997, pp. 107-110. A newly proposed set of extensions to H.324,
called H.324/M, provides compatibility with mobile or impaired telecommunications systems,
and accommodates errors and distortions in transmissions, reduced or variable transmission rates 
and other anomalies of known available mobile telecommunications systems, such as Cellular, 
GSM, and PCS. 

Four common standards are employed, allowing videoconferencing stations to 
communicate with each other. The first is called H.320, and 
encompasses relatively high bandwidth systems, in increments of 64 kbits/sec digital 
communication with a synchronous communication protocol. Generally, these systems 
communicate at 128 kbits/sec, 256 kbits/sec or 384 kbits/sec, over a number of "bonded" 
ISDN B-channels. The second standard, H.324, employs a standard POTS communication link 
with a V.80/V.34bis modem, communicating at 33.6 kbits/sec synchronous. The third standard is 
the newly established H.323 standard, which provides for videoconferencing over a packet 
switched network, such as Ethernet, using IPX or TCP/IP. Finally, there are so-called Internet 
videophone systems, such as Intel ProShare. See, Andrew W. Davis, "The Video Answering 
Machine: Intel ProShare's Next Step", Advanced Imaging, pp. 28-30 (March 1997). 
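
The relationship between the number of bonded B-channels and the rates quoted above can be illustrated with a short calculation. This is a minimal sketch only; the function name `bonded_rate_kbits` is hypothetical and is not taken from any standard.

```python
B_CHANNEL_KBITS = 64  # one ISDN B-channel carries 64 kbits/sec


def bonded_rate_kbits(num_b_channels: int) -> int:
    """Aggregate rate of a call bonded across `num_b_channels` B-channels."""
    if num_b_channels < 1:
        raise ValueError("at least one B-channel is required")
    return num_b_channels * B_CHANNEL_KBITS


# Two, four and six bonded channels give the typical rates named in the text.
for n in (2, 4, 6):
    print(n, "B-channels ->", bonded_rate_kbits(n), "kbits/sec")
```
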

In known standards-based videoconferencing systems, the image is generally compressed 
using a discrete cosine transform, which operates in the spatial frequency domain. In this 
domain, visually unimportant information, such as low frequencies and high frequency noise, is 
eliminated, leaving visually important information. Further, because much of the information in 
a videoconference image is repeated in sequential frames, with possible movement, this 
redundant information is transmitted infrequently and filtered from the transmitted image stream, 
and described with motion vector information. This motion vector information encodes objects 
which are fixed or move somewhat between frames. Such known techniques include H.261, 
with integer pixel motion estimation, and H.263, which provides 1/2 pixel motion estimation. 
Other techniques for video compression are known or have been proposed, such as H.263+ and 
MPEG-4 encoding. Many standard videoconferencing protocols require the initial transmission 
of a full frame image, in order to set both transmitting and receiving stations to the same 
encoding state. The digital data describing this image is typically Huffman encoded for 
transmission. Multiple frames may be combined and coded as a unit, for example as so-called 
PB frames. Other techniques are also known for reducing image data transmission bandwidth for 
various applications, including video conferencing. 
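
Integer pixel motion estimation of the kind mentioned above may be illustrated by an exhaustive block-matching search. The following is a sketch under stated assumptions, not the procedure of any particular standard: the sum of absolute differences cost, the 4x4 block size, the +/-2 pixel search range and the function names are all chosen here for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n=4, search=2):
    """Exhaustive integer-pixel search for the displacement into `ref`
    that best predicts the block of `cur` at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


def frame_with_square(left, top, size=8):
    """Blank frame containing a bright 2 x 2 square at (left, top)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 200
    return frame


ref = frame_with_square(2, 2)   # previous frame
cur = frame_with_square(4, 3)   # square moved 2 right, 1 down
print(best_motion_vector(cur, ref, bx=3, by=2))  # ((-2, -1), 0)
```

The vector points from the current block back to its best match in the previous frame, so only the vector and any residual difference need be transmitted rather than the block itself.
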

Each remote videoconference terminal has an interface system, which receives the digital 
data, and separates the video information (H.261, H.263), audio information (G.711, G.723, 
G.723.1), data protocol information (HDLC, V.14, LAPM, etc.) and control information (H.245, 
H.221/H.223) into discrete streams, which are processed separately. Likewise, each terminal 
interface system also assembles the audio information, video information, data protocols and 
control data for transmission. The control information consists of various types of information: 
the standard control protocol, which addresses the data format, error correction, exception 
handling, and other types of control; and the multipoint control information, such as which 
remote videoconference terminal(s) to receive audio information from, selective audio muting, 
and such. Generally, the standard, low level control information is processed locally, at the 
codec interface system, and filtered from the remainder of the multipoint control system, with 
only the extracted content information made available to the other stations. 
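
The separation of a received data stream into discrete streams may be sketched as follows. The packet representation, a pair of a protocol tag and a payload, is a simplifying assumption made for illustration and does not reflect the actual framing defined by H.221 or H.223; the names `STREAM_OF` and `demultiplex` are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from a protocol tag to the stream it belongs to.
STREAM_OF = {
    "H.261": "video", "H.263": "video",
    "G.711": "audio", "G.723": "audio", "G.723.1": "audio",
    "HDLC": "data", "V.14": "data", "LAPM": "data",
    "H.245": "control",
}


def demultiplex(packets):
    """Split (tag, payload) pairs into per-stream payload lists."""
    streams = defaultdict(list)
    for tag, payload in packets:
        streams[STREAM_OF.get(tag, "unknown")].append(payload)
    return dict(streams)


received = [
    ("H.263", b"frame-1"),
    ("G.723.1", b"audio-1"),
    ("H.245", b"mute-request"),
    ("H.263", b"frame-2"),
]
print(demultiplex(received))
```
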

The ITU has developed a set of multipoint videoconferencing standards or 
recommendations, T.120-T.133, T.RES series, H.231, H.243, etc. These define control schemes 
for multiple party video conferences. Typically, these protocols are implemented in systems 
which either identically replicate the source image data stream from one source to a plurality of 
destinations, or completely decode and reencode the image in a different format in a "transcoder" 
arrangement, to accommodate incompatible conference stations. The ITU standards also allow 
optional data fields which may be used to communicate digital information essentially outside 
the videoconference scheme, and provide data conferencing capabilities, which allow 
videoconferencing and data conferencing to proceed simultaneously. See, ITU T.120-T.127, 
T.130-T.133, T.RES, T.Share and T.TUD recommendations, expressly incorporated herein by 
reference. 

There are a number of known techniques for transmitting and displaying alphanumeric 
data on a television, the most common of which are teletext, used primarily in Europe, and 
closed caption, which is mandated in television sets larger than 13 inches by the Television 
Decoder Circuitry Act of 1990, and Section 305 of the Telecommunications Act of 1996, and 
Federal Communication Commission (FCC) regulations. The American closed caption standard 
is EIA 608. The latter is of particular interest because many current generation televisions, 
especially larger sizes, include a closed caption decoder, and thus require no external hardware or 
connections, separate from the hardware and cabling for supplying the video signal. See, TCC 
Tech Facts, Vols. 1-4, (www.wgbh.org, rev. 9/95) expressly incorporated herein by reference. 
The closed caption signal is distributed on Line 21 of the vertical blanking interval. The existing 
standard supports 480 bits/sec, with a potential increase to 9600 bits/sec in the forthcoming 
ATSC standard. 
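
The 480 bits/sec figure can be reproduced with a short calculation. The sketch below assumes, as is commonly described for Line 21 captioning, that two bytes (each seven data bits plus one parity bit) are carried per video frame at the NTSC frame rate; these constants are assumptions of the example rather than statements taken from the text above.

```python
NTSC_FRAME_RATE = 30000 / 1001   # frames per second, about 29.97
BYTES_PER_FRAME = 2              # assumed caption bytes carried per frame
BITS_PER_BYTE = 8                # 7 data bits plus 1 parity bit


def caption_bit_rate(frame_rate=NTSC_FRAME_RATE):
    """Gross caption channel rate in bits per second."""
    return BYTES_PER_FRAME * BITS_PER_BYTE * frame_rate


print(round(caption_bit_rate(), 1))  # 479.5
```
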

Known systems provide a videoconferencing system which resides in a "set top box", i.e., 
a stand-alone hardware device suitable for placement on top of a television set, providing all of the 
necessary functionality of a videoconferencing system employing the television as the display 
and possibly audio speaker functions. These systems, however, do not integrate the television 
functions, nor provide interaction between the video and videoconferencing systems. C-Phone 
Inc., Wilmington NC, provides a C-Phone Home product line which provides extensions to 
H.324 and/or H.320 communications in a set-top box. 

Other known videophone and videoconferencing devices are disclosed, e.g., in U.S. 
Patent Nos. 5,600,646; 5,565,910; 5,564,001; 5,555,443; 5,553,609; 5,548,322; 5,542,102; 
5,537,472; 5,526,405; 5,509,009; 5,500,671; 5,490,208; 5,438,357; 5,404,579; 5,374,952; 
5,224,151; 4,543,665; 4,491,694; 4,465,902; 4,456,925; 4,427,847; 4,414,432; 4,377,729; 
4,356,509; 4,349,701; 4,338,492; 4,008,376 and 3,984,638, each of which is expressly 
incorporated herein by reference. 

Known Web/TV devices (from Sony/Magnavox/Philips) allow use of a television to 
display alphanumeric data, as well as audiovisual data, but format this data for display outside 
the television. In addition, embedded Web servers are also known. See, Richard A. Quinell, 
"Web Servers in embedded systems enhance user interaction", EDN, April 10, 1997, pp. 61-68, 
incorporated herein by reference. Likewise, combined analog and digital data transmission 
schemes are also known. See, U.S. Patent No. 5,404,579. 

A class of computing devices is also known, representing a convergence of personal computers and 
entertainment devices, which provide network access to the Internet (a publicly available 
network operating over TCP/IP). ITU standards for communications systems allow the selective 
addition of data, according to T.120-T.133, T.RES series of protocols, as well as HDLC, V.14, 
LAPM, to the videoconference stream, especially where excess bandwidth is available for upload 
or download. 

A system may be provided with features enabling it to control a so-called smart house 
and/or to be a part of a security and/or monitoring system, with imaging capability. These 
functions are provided as follows. As discussed above, various data streams may be integrated 
with a videoconference data stream over the same physical link. Therefore, external inputs and 
outputs may be provided to the videophone or videoconference terminal, which may be processed 
locally and/or transmitted over the telecommunications link. The local device, in this case, is 
provided with a continuous connection or an autodial function, to create a communications link 
as necessary. Therefore, heating, ventilation and air conditioning (HVAC) control, lighting, 
appliances, machinery, valves, security sensors, locks, gates, access points, etc., may all be 
controlled locally or remotely through interfaces of the local system, which may include logic 
level signals, relays, serial ports, computer networks, fiber optic interfaces, infrared beams, radio 
frequency signals, transmissions through power lines, standard-type computer network 
communications (twisted pair, coaxial cable, fiber optic cable), acoustic transmissions and other 
known techniques. Likewise, inputs from various devices and sensors, such as light or optical, 
temperature, humidity, moisture, pressure, fluid level, security devices, radio frequency, acoustic, 
may be received and processed locally or remotely. A video and audio signal transmission may 
also be combined with the data signals, allowing enhanced remote monitoring and control 
possibilities. This information, when transmitted through the telecommunication link, may be 
directed to another remote terminal, for example a monitoring service or person seeking to 
monitor his own home, or intercepted and processed at a central control unit or another device. 
Remote events may be monitored, for example, on a closed caption display mode of a television 
attached to a videophone. 

While the preferred embodiments of the invention adhere to established standards, the 
present invention also encompasses communications which deviate from or extend beyond such 
standards, and thus may engage in proprietary communications protocols, between compatible 
units. 

OTHER REFERENCES 

In addition, the following patents are considered relevant to the data compression and 
pattern recognition functions of the apparatus and interface of the present invention and are 
incorporated herein by reference: 

U.S. Patent Nos. 3,609,684; 3,849,760; 3,950,733; 3,967,241; 

The aforementioned patents, some of which are mentioned elsewhere in this disclosure, and which 
form a part of this disclosure, may be applied in known manner by those skilled in the art in 
order to practice various embodiments of the present invention. 

The following scientific articles, some of which are discussed elsewhere herein, are 
understood by those skilled in the art and relate to the pattern recognition and image compression 
functions of the apparatus and interface of the present invention: 

"Fractal Geometry-Understanding Chaos", Georgia Tech Alumni Magazine, p. 16 (Spring 
1986). 

"Fractal Modelling of Biological Structures", School of Mathematics, Georgia Institute of 

"Fractal Modelling of Real World Images", Lecture Notes for Fractals: Introduction, 
Basics and Perspectives, Siggraph (1987). 

"Fractals Yield High Compression", Electronic Engineering Times, September 30, 1991, 
p. 39. 

"Fractals-A Geometry of Nature", Georgia Institute of Technology Research Horizons, p. 
9 (Spring 1986). 

"Voice Recognition and Speech Processing", Elektor Electronics, Sep. 1985, pp. 56-57. 

Aleksander, I., "Guide to Pattern Recognition Using Random-Access Memories", 
Computers and Digital Techniques, 2(1):29-40 (Feb. 1979). 

Anderson, F., W. Christiansen, B. Kortegaard, "Real Time, Video Image Centroid 
Tracker", Apr. 16-20, 1990. 

Anson, L., M. Barnsley, "Graphics Compression Technology", SunWorld, pp. 43-52 
(October 1991). 

Appriou, A., "Interet des theories de l'incertain en fusion de donnees", Colloque 
International sur le Radar Paris, 24-28 avril 1989. 

Appriou, A., "Procedure d'aide a la decision multi-informateurs. Applications a la 
classification multi-capteurs de cibles", Symposium de l'Avionics Panel (AGARD) Turquie, 
25-29 avril 1988. 

Arrow, K. J., "Social choice and individual values", John Wiley and Sons Inc. (1963). 

Barnsley et al., "A Better Way to Compress Images", Byte Magazine, Jan. 1988. 

Barnsley et al., "Harnessing Chaos For Image Synthesis", Computer Graphics, 22(4) 
(8/1988). 

Barnsley et al., "Hidden Variable Fractal Interpolation Functions", School of 
Mathematics, Georgia Institute of Technology, Atlanta, GA. 30332, Jul., 1986. 

Batchelor, B. G., "Pattern Recognition, Ideas in Practice", Plenum Press, London and 
New York, (1978). 

Batchelor, B. G., "Practical Approach to Pattern Classification", Plenum Press, London 
and New York, (1974). 

Bellman, R. E., L. A. Zadeh, "Decision making in a fuzzy environment", Management 
Science, 17(4) (December 1970). 

Bhatnagar, R. K., L. N. Kanal, "Handling uncertain information: a review of numeric 
and non-numeric methods", Uncertainty in Artificial Intelligence, L. N. Kanal and J. F. Lemmer, 
Eds. (1986). 

Blair, D., R. Pollack, "La logique du choix collectif", Pour la Science (1983). 

Burr, D. J., "A Neural Network Digit Recognizer", Proceedings of the 1986 IEEE 
International Conference of Systems, Man and Cybernetics, Atlanta, Ga., pp. 1621-1625. 

Caffery, B., "Fractal Compression Breakthrough for Multimedia Applications", Inside, 
October 9, 1991. 

Carpenter, G. A., S. Grossberg, "The Art of Adaptive Pattern Recognition by a 
Self-Organizing Neural Network", IEEE Computer, Mar. 1988, pp. 77-88. 

Casasent, D., et al., "General I and Q Data Processing on a Multichannel AO System", 
Applied Optics, 25(18):3217-24 (Sep. 15, 1986). 

Caudill, M., "Neural Networks Primer-Part III", AI Expert, Jun. 1988, pp. 53-59. 

Chao, J. J., E. Drakopoulos, C. C. Lee, "An evidential reasoning approach to distributed 
multiple hypothesis detection", Proceedings of the 20th Conference on decision and control, Los 
Angeles, Calif., December 1987. 

Chao, T.-H.; Hegblom, E.; Lau, B.; Stoner, W.W.; Miceli, W.J., "Optoelectronically 
implemented neural network with a wavelet preprocessor", Proceedings of the SPIE - The 
International Society for Optical Engineering, 2026:472-82 (1993). 

Chen et al., "Adaptive Coding of Monochrome and Color Images", Nov. 1977, pp. 
1285-1292. 

Cheong, C.K.; Aizawa, K.; Saito, T.; Hatori, M., "Adaptive edge detection with fractal 
dimension", Transactions of the Institute of Electronics, Information and Communication 
Engineers D-II, J76D-II(11):2459-63 (1993). 

Computer Vision, Graphics, and Image Processing, 1987, 37:54-115. 

Computers and Biomedical Research 5, 388-410 (1972). 

Cooper, L. N., "A Possible Organization of Animal Memory and Learning", Nobel 24, 
(1973), Collective Properties of Physical Systems, pp. 252-264. 

Crawford et al., "Adaptive Pattern Recognition Applied To An Expert System For Fault 
Diagnosis In Telecommunications Equipment", pp. 10/1-8 (Inspec. Abstract No. 86C010699, 
Inspec IEE (London) & IEE Coll. on "Adaptive Filters", Digest No. 76, Oct. 10, 1985). 

Danielsson, Erik, et al., "Computer Architectures for Pictorial Inf. Systems", IEEE 
Computer, Nov., 1981, pp. 53-67. 

Dempster, A. P., "A generalization of Bayesian inference", Journal of the Royal 
Statistical Society, Vol. 30, Series B (1968). 

Dempster, A. P., "Upper and lower probabilities induced by a multivalued mapping", 
Annals of Mathematical Statistics, no. 38 (1967). 

Denker, 1984 International Test Conf., Oct. 1984, Philadelphia, Pa., pp. 558-563. 

Dubois, D., "Modeles mathematiques de l'imprecis et de l'incertain en vue d'applications 
aux techniques d'aide a la decision", Doctoral Thesis, University of Grenoble (1983). 

Dubois, D., N. Prade, "Combination of uncertainty with belief functions: a 
reexamination", Proceedings 9th International Joint Conference on Artificial Intelligence, Los 
Angeles (1985). 

Dubois, D., N. Prade, "Fuzzy sets and systems-Theory and applications", Academic 
Press, New York (1980). 

Dubois, D., N. Prade, "Theorie des possibilites: application a la representation des 
connaissances en informatique", Masson, Paris (1985). 

Duda, R. O., P. E. Hart, M. J. Nilsson, "Subjective Bayesian methods for rule-based 
inference systems", Technical Note 124, Artificial Intelligence Center, SRI International. 

Dunning, B. B., "Self-Learning Data-Base For Automated Fault Localization", IEEE, 
1979, pp. 155-157. 

Farrelle, Paul M. and Jain, Anil K., "Recursive Block Coding-A New Approach to 
Transform Coding", IEEE Transactions on Communications, Com. 34(2) (Feb. 1986). 

Fitzpatrick, J. M., J. J. Grefenstette, D. Van Gucht, "Image Registration by Genetic 
Search", Conf. Proc., IEEE Southeastcon 1984, pp. 460-464. 

Fua, P. V., "Using probability density functions in the framework of evidential reasoning", 
Uncertainty in knowledge based systems, B. Bouchon, R. R. Yager, Eds., Springer Verlag 
(1987). 

Gogoussis et al., Proc. SPIE Intl. Soc. Opt. Eng., Nov. 1984, Cambridge, MA., pp. 
121-127. 

Grossberg, S., G. Carpenter, "A Massively Parallel Architecture for a Self-Organizing 
Neural Pattern Recognition Machine", Computer Vision, Graphics, and Image Processing, 1987, 
37:54-115, 252-315. 

Gullichsen, E., E. Chang, "Pattern Classification by Neural Network: An Experiment 
System for Icon Recognition", ICNN Proceeding on Neural Networks, Mar. 1987, pp. 
IV-725-32. 

Haruki, K. et al., "Pattern Recognition of Handwritten Phonetic Japanese Alphabet 
Characters", International Joint Conference on Neural Networks, Washington, D.C., Jan. 1990, 
pp. II-515 to II-518. 

Hayashi, Y., et al., "Alphanumeric Character Recognition Using a Connectionist Model 
with the Pocket Algorithm", Proceedings of the International Joint Conference on Neural 
Networks, Washington, D.C. Jun. 18-22, 1989, vol. 2, pp. 606-613. 

Hayes, H.I.; Solka, J.L.; Priebe, C.E., "Parallel computation of fractal dimension", 
Proceedings of the SPIE - The International Society for Optical Engineering, 1962:219-30 
(1993). 

Hinton et al., "Boltzmann Machines: Constraint Satisfaction Networks that Learn", Tech. 
Report CMU-CS-85-119, Carnegie-Mellon Univ, 5/84. 

Hoare, F.; de Jager, G., "Neural networks for extracting features of objects in images as a 
pre-processing stage to pattern classification", Proceedings of the 1992 South African 
Symposium on Communications and Signal Processing. COMSIG '92 (Cat. No.92TH0482-0). 
Inggs, M. (Ed.), p. 239-42 (1992). 

Hopfield et al., "Computing with Neural Circuits: A Model", Science, 233:625-633 (8 
Aug. 1986). 

Hopfield, "Neural Networks and Physical Systems with Emergent Collective 
Computational Abilities", Proc. Natl. Acad. Sci. USA, 79:2554-2558 (Apr. 1982). 

Hopfield, "Neurons with graded response have collective computational properties like 
those of two-state neurons", Proc. Natl. Acad. Sci. USA, 81:3088-3092 (May 1984). 

Hurtgen, B.; Buttgen, P., "Fractal approach to low rate video coding", Proceedings of the 
SPIE - The International Society for Optical Engineering, 2094(pt. 1):120-31 (1993). 

Information Processing 71, North-Holland Publishing Company (1972) pp. 1530-1533. 

Ishizuka, M., "Inference methods based on extended Dempster and Shafer's theory for 
problems with uncertainty/fuzziness", New Generation Computing, Ohmsha, Ltd, and Springer 
Verlag, 1:159-168 (1983). 

Jackel, L. D., H. P. Graf, J. S. Denker, D. Henderson and I. Guyon, "An Application of 

It should be realized that the hardware for implementing a system may be integrally related to the 
choice of specific method or software algorithm for implementing the system, and therefore these 
together form a system. It is noted that in view of the present disclosure, it is within the skill of 
the artisan to combine in various fashions the available methods and apparatus to achieve the 
advanced interface and control system of the present invention. 

SUMMARY AND OBJECTS OF THE INVENTION 

The present invention provides, according to one embodiment, an adaptive user interface 
which changes in response to the context, past history and status of the system. The strategy 
employed preferably seeks to minimize, for an individual user at any given time, the search and 
acquisition time for the entry of data through the interface. 

The interface may therefore provide a model of the user, which is employed in a 
predictive algorithm. The model parameters may be static (once created) or dynamic, and may 
be adaptive to the user or alterations in the use pattern. 

The present invention also provides a model-based pattern recognition system, for 
determining the presence of an object within an image. By providing models of the objects 
within an image, the recognition process is relatively unaffected by perspective, and the 
recognition may take place in a higher dimensionality space than the transmitted media. Thus, 
for example, a motion image may include four degrees of freedom: x, y, chroma/luma, and time. 
A model of an object may include further dimensions, including z, and axes of movement. 
Therefore, the model allows recognition of the object in its various configurations and 
perspectives. 

A major theme of the present invention is the use of intelligent, adaptive pattern 
recognition in order to provide the operator with a small number of high probability choices, 
which may be complex, without the need for explicit definition of each atomic instruction 
comprising the desired action. The interface system predicts a desired action based on the user 
input, a past history of use, a context of use, and a set of predetermined or adaptive rules. 
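
The presentation of a small number of high probability choices may be sketched, in its simplest form, as a ranking of past actions by how often they were taken in the present context. The class below is a minimal illustration under that assumption; the name `ChoicePredictor`, the use of plain frequency counts and the fallback to overall usage are choices made for this example, and a practical system might instead employ the neural network or rule-based techniques discussed herein.

```python
from collections import Counter, defaultdict


class ChoicePredictor:
    """Ranks candidate actions from a history of (context, action) pairs."""

    def __init__(self):
        self._by_context = defaultdict(Counter)
        self._overall = Counter()

    def record(self, context, action):
        """Store one observed use of `action` in `context`."""
        self._by_context[context][action] += 1
        self._overall[action] += 1

    def suggest(self, context, k=3):
        """Return up to k actions, most likely first.

        Actions seen in this context come first; overall usage is the
        fallback when the context has little or no history.
        """
        ranked = [a for a, _ in self._by_context[context].most_common()]
        for action, _ in self._overall.most_common():
            if action not in ranked:
                ranked.append(action)
        return ranked[:k]


predictor = ChoicePredictor()
for _ in range(3):
    predictor.record("evening", "record news")
predictor.record("evening", "play tape")
for _ in range(2):
    predictor.record("morning", "set timer")

print(predictor.suggest("evening", k=2))  # ['record news', 'play tape']
print(predictor.suggest("weekend", k=2))  # ['record news', 'set timer']
```
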

Because the present invention emphasizes adaptive pattern recognition of both the input 
of the user and data which may be available, the interface system proposes the extensive use of 
advanced signal processing and neural networks. These processing systems may be shared 
between the interface system and the functional system, and therefore a controller for a complex 
system may make use of the intrinsic processing power available rather than requiring additional 
computing power, although this unification is not required. In the case where the user interface 
employs common hardware elements, it is further preferred that the interface subsystem employ 
common models of the underlying data structures on which the device functionally operates. 

In fact, while hardware efficiency dictates common hardware for the interface system and 
the operational routine, other designs may separate the interface system from the operational 
system, allowing portability and efficient application of a single interface system for a number of 
operational systems. Thus, the present invention also proposes a portable human interface 
system which may be used to control a number of different devices. In this case, a web browser 
metaphor is preferred, as it has become a standard for electronic communications. 

A portable interface may, for example, take the form of a personal digital assistant or 
downloaded JAVA applet, with the data originating in a web server. The data from a web server 
or embedded web server may include a binary file, a generic HTML/XML file, or other data 
type. The interface receives the data and formats it based, at least in part, on parameters specific 
to the client or user. Thus, the presentation of data is responsive to the user, based on user 
preferences, as opposed to hardware limitations or compatibility issues. In a preferred 
embodiment, the data is transmitted separately from the presentation definition. The presentation 
definition, on the other hand, provides a set of parameters that propose or constrain the data 
presentation. The user system also provides a set of parameters that set preferences on 
presentation. Further, the data itself is analyzed for appropriate presentation parameters. These 
three sets of considerations are all inputs into a "negotiation" for an ultimate presentation 
scheme. Thus, the presentation is adaptive to server parameters, user parameters, and the data 
itself. For example, in a typical web-context, the color, size, typestyle, and layout of text may be 
modified based on these considerations. Other factors that may be altered include frame size and 
layout, size of hotspots, requirement for single or double clicks for action, and the like. 
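
The "negotiation" among server parameters, user parameters and the data itself may be sketched as follows. This is one possible resolution rule, assumed here purely for illustration: the server's presentation definition supplies a permitted range and a default for each parameter, the user preference takes priority where present, a hint derived from the data is used otherwise, and the result is constrained to the server's range. The parameter names are hypothetical.

```python
def negotiate_presentation(server, user, data_hints):
    """Resolve each presentation parameter to a single value.

    server:     name -> (minimum, maximum, default) from the presentation definition
    user:       name -> value preferred by the user system
    data_hints: name -> value suggested by analysis of the data itself
    """
    result = {}
    for name, (low, high, default) in server.items():
        # Assumed priority: user preference, then data hint, then server default.
        value = user.get(name, data_hints.get(name, default))
        # The server's range constrains whatever was proposed.
        result[name] = min(max(value, low), high)
    return result


server = {"font_pt": (8, 24, 12), "hotspot_px": (16, 64, 24)}
user = {"font_pt": 30}                      # user prefers very large text
data_hints = {"font_pt": 10, "hotspot_px": 40}

print(negotiate_presentation(server, user, data_hints))
# {'font_pt': 24, 'hotspot_px': 40}
```
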

The adaptive nature of the present invention derives from an understanding that people 
learn most efficiently through the interactive experiences of doing, thinking, and knowing. For 
ease-of-use, efficiency, and lack of frustration of the user, the interface of the device should be 
intuitive and self explanatory, providing perceptual feedback to assist the operator in 
communicating with the interface, which in turn allows the operational system to receive a 
description of a desired operation. Another important aspect of man-machine interaction is that 
there is a learning curve, which dictates that devices which are especially easy to master become 
frustratingly elemental after continued use, while devices which have complex functionality with 
many options are difficult to master and may be initially rejected, or the user stops exploring. 

Hoffberg et al. LIH-10 

One such system which addresses this problem is U.S. Patent No. 5,005,084, expressly 
incorporated herein by reference. The present invention addresses these issues by determining 
the most likely instructions of the operator, and presenting these as easily available choices, by 
analyzing the past history data and by detecting the "sophistication" of the user in performing a 
function, based on all information available to it. The context of use may also be a significant 
factor. The interface seeks to optimize the relevant portion of the interface adaptively and 
immediately in order to balance and optimize the interface for both quantitative and qualitative 
factors. This functionality may greatly enhance the quality of interaction between man and 
machine, allowing a higher degree of overall system sophistication to be tolerated and a greater 
value added than other interface designs. See, Commaford, C, "User-Responsive Software Must 
Anticipate Our Needs", PC Week, May 24, 1993. 

The present interface system analyzes data from the user, which may be both the 
selections made by the user in context, as well as the efficiency by which the user achieves the 
selection. Thus, information concerning both the endpoints and the time-dependent path of the 
process is considered and analyzed by the interface system. 
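
An analysis of both the endpoint and the path by which the user reached it may be sketched as follows. The event log format, the notion of a known minimal number of steps and the field names are assumptions made for this illustration.

```python
def analyze_session(events, minimal_steps):
    """Summarize one selection.

    events:        list of (timestamp_seconds, key) pairs, ending with the
                   key press that completed the selection
    minimal_steps: fewest key presses that could have reached the same endpoint
    """
    if not events:
        raise ValueError("no events recorded")
    steps = len(events)
    return {
        "endpoint": events[-1][1],
        "steps": steps,
        "elapsed_s": events[-1][0] - events[0][0],
        # 1.0 means the user took the shortest possible path.
        "path_efficiency": minimal_steps / steps,
    }


log = [(0.0, "menu"), (1.5, "down"), (2.0, "up"), (3.0, "down"), (4.0, "select")]
print(analyze_session(log, minimal_steps=3))
```

A low path efficiency or a long elapsed time for a frequently chosen endpoint would suggest that the corresponding function should be made more readily available to that user.
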

The interface of the present invention may be advantageously applied to an operational 
system that has a plurality of functions, certain of which are unnecessary or are rarely used in 
various contexts, while others are used with greater frequency. In such systems, the functionality 
use is usually predictable. Therefore, the present invention provides an optimized interface 
system which, upon recognizing a context, dynamically reconfigures the availability or ease of 
availability of functions and allows various subsets to be used through "shortcuts". The interface 
presentation will therefore vary over time, use and the particular user. 

The advantages to be gained by using an intelligent data analysis interface for facilitating 
user control and operation of the system are more than merely reducing the average number of 
selections or time to access a given function. Rather, advantages also arise from providing a 
means for access and availability of functions not necessarily previously existing or known to the 
user, therefore improving the perceived quality and usefulness of the product. Further 
advantages over prior interfaces accrue due to the availability of pattern recognition functionality 
as a part of the interface system. 

In those cases where the pattern recognition functions are applied to large amounts of 
data or complex data sets, in order to provide a sufficient advantage and acceptable response 
time, powerful computational resources, such as advanced DSPs or neural network processors, 
are made available to the interface system. On the other hand, where the data is simple or of 
limited scope, aspects of the technology may be easily implemented as added software 
functionality as improvements of existing products having limited computational resources. 

The application of these technologies to multimedia systems provides a new model for 
performing image pattern recognition on multimedia data and for the programming of 
applications including such data. The ability of the interface of the present invention to perform 
abstractions and make decisions regarding a closeness of presented data to selection criteria 
makes the interface suitable for use in a programmable control, i.e., determining the existence of 
certain conditions and taking certain actions on the occurrence of detected events. Such 
advanced technologies might be especially valuable for disabled users. 
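
Use of the interface as a programmable control, in which an action is taken when presented data is sufficiently close to selection criteria, may be sketched as follows. The closeness measure used here, the overlap between two keyword sets, is a deliberately simple stand-in for the pattern recognition functions described herein, and the rule format and names are hypothetical.

```python
def keyword_closeness(description, criteria):
    """Overlap between two keyword sets (Jaccard index), from 0.0 to 1.0."""
    a, b = set(description), set(criteria)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evaluate(event_keywords, rules):
    """Return the action of every rule whose criteria the event satisfies.

    rules: list of (criteria_keywords, threshold, action) tuples
    """
    return [
        action
        for criteria, threshold, action in rules
        if keyword_closeness(event_keywords, criteria) >= threshold
    ]


rules = [
    ({"news", "weather"}, 0.5, "record"),
    ({"sports", "football"}, 0.5, "notify user"),
]
print(evaluate({"news", "weather", "local"}, rules))  # ['record']
```
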

In a multimedia environment, a user often wishes to perform an operation on a 
multimedia data event. Past systems have required explicit indexing of images and events. The 
present technologies, however, allow an image, diagrammatic, abstract or linguistic description 
of the desired event to be acquired by the interface system from the user and applied to identify 
or predict the multimedia event(s) desired without requiring a separate manual indexing or 
classification effort. These technologies may also be applied to single media data. 

The interface system according to the present invention is not limited to a single data 
source, and may analyze data from many different sources for its operation. This data may be 
stored data or present in a data stream. Thus, in a multimedia system, there may be a real-time 
data stream, a stored event database, as well as an exemplar or model database. Further, since 
the device is adaptive, information relating to past experience of the interface, both with respect 
to exposure to data streams and user interaction, is also stored. This data analysis aspect of the 
operation of the present interface system may be substantially processor intensive, especially 
where the data includes abstract or linguistic concepts or images to be analyzed. Interfaces 
which do not relate to the processing of such data may be implemented on simpler hardware. On 
the other hand, systems which handle complex data types may necessarily include sophisticated 
processors, adaptable for use with the interface system, thus minimizing the additional 
computing power necessary in order to implement the interface according to the present 
invention. A portion of the data analysis may also overlap the functional analysis of the data for 
operation. 

A fractal-based image processing system exemplifies one application of the technologies. 
A fractal-based system includes a database of image objects, which may be preprocessed in a 
manner which makes them suitable for comparison to a fractal-transformed image representation 
of an image to be analyzed. Thus, corresponding "fractal" transforms are performed on the 
unidentified image or a portion thereof and on an exemplar of a database. A degree of 
relatedness is determined in this "fractal transform domain", and the results used to identify 
objects within the image. The system then makes decisions based on the information content of 
the image, i.e. the objects contained therein. 

The fractal-based image processing system presents many advantages. First, fractal- 
processed images may have dramatically reduced storage size requirements as compared to 
traditional methods while substantially retaining information important for image recognition.
The process may be parallelized, and the exemplars may be multidimensional, further facilitating
the process of identifying a two-dimensional projection of an object. The efficient storage of
information allows the use of inexpensive storage media, e.g., CD-ROM, or the use of an on-line
database through a serial data link, while allowing acceptable throughput. See, Zenith Starsight
Telecast brochure, (1994); U.S. Patent No. 5,353,121, expressly incorporated herein by
reference.

As applied to a multimedia database storage and retrieval system, the user programs, 
through an adaptive user interface according to the present invention, the processing of data, by 
defining a criteria and the actions to be taken based on the determination of the criteria. The 
criteria, it is noted, need not be of a predefined type, and in fact this is a particular feature of the
present invention. A pattern recognition subsystem is employed to determine the existence of
selected criteria. To facilitate this process, a database of image objects may be stored as two 
counterparts: first, the data is stored in a compressed format optimized for normal use, such as 
human viewing on a video monitor, using, e.g., MPEG-2 or Joint Photographic Experts Group 
(JPEG) compression; second, it is stored in a preprocessed and highly compressed format
adapted to be used with the pattern recognition system. Because the preprocessed data is highly
compressed and used directly by the pattern recognition system, great efficiencies in storage and
data transmission are achieved. The image preprocessing may include Fourier, DCT, wavelet, 
Gabor, fractal, or model-based approaches, or a combination thereof. 
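
By way of illustration only, the comparison of a preprocessed exemplar database with an unidentified image in a transform domain may be sketched as follows. This sketch substitutes a low-frequency DCT signature (one of the preprocessing options listed above) for a fractal transform, and uses cosine similarity as the measure of relatedness. The function names, the signature size k, and the similarity measure are assumptions made for the example and are not taken from the disclosure.

```python
import math

def dct_signature(image, k=3):
    """Return the k x k lowest-frequency 2D DCT-II coefficients of a square
    grayscale image (a list of equal-length rows), flattened into a list."""
    n = len(image)
    signature = []
    for u in range(k):
        for v in range(k):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (image[x][y]
                              * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                              * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            signature.append(total)
    return signature

def relatedness(sig_a, sig_b):
    """Cosine similarity between two signatures; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(sig_a, sig_b))
    norm = (math.sqrt(sum(a * a for a in sig_a))
            * math.sqrt(sum(b * b for b in sig_b)))
    return dot / norm if norm else 0.0

def identify(unknown_image, exemplar_signatures, k=3):
    """Return (label, score) for the stored exemplar signature most related
    to the unknown image. Exemplars are held only in preprocessed form."""
    target = dct_signature(unknown_image, k)
    scored = [(label, relatedness(target, sig))
              for label, sig in exemplar_signatures.items()]
    return max(scored, key=lambda item: item[1])
```

In this arrangement only the compact signatures need be stored for recognition purposes, while the viewable counterpart of each image may be kept separately in a conventional compressed format.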

The potentially significant hardware requirement for image processing and pattern
recognition is counterbalanced by the enhanced functionality available by virtue of the
technologies. When applied to multimedia devices, the interface system allows the operator to
define complex criteria with respect to image, abstract or linguistic concepts, which would 
otherwise be difficult or impossible to formulate. Thus, the interface system becomes part of a 
computational system that would otherwise be too cumbersome for use. It is noted that, in many
types of media streams, a number of "clues" are available defining the content, including closed
caption text, electronic program guides, simulcast data, related Internet web sites, audio tracks,
image information, and the like. The latter two data types require difficult processing in order to 
extract a semantic content, while the former types are inherently semantic data. 

A pattern recognition subsystem allows a "description" of an "event" without explicit
definition of the data representing the "event". Thus, instead of requiring explicit programming,
an operator may merely define parameters of the desired "event". This type of system is useful,
for example, where a user seeks a generic type of data representing a variety of events. This
eliminates the need for preindexing or standardized characterization of the data. The interface
system therefore facilitates the formulation of a request, and then searches the database for data
which corresponds to the request. Such preindexing or standardized characterization is
extremely limiting with image and multimedia data, because "a picture is worth a thousand
words", and without knowing the ultimate search criteria a priori, all possible criteria must be
accounted for. Pattern recognition systems do not require initial translation of visual aspects into
linguistic concepts, thus allowing broader searching capability. Of course, a pattern recognition
system may be used in conjunction with other searching schemes, to mutual advantage.

The pattern recognition functionality of the interface system is not limited to multimedia 
data, and may be applied to data of almost any type, e.g., real-time sensor data, distributed 
control, linguistic data, etc. 

It is noted that, in consumer electronics and particularly entertainment applications, the
reliability of the system need not be perfect, and errors may be tolerable. On the other hand, in
industrial control applications, reliability must be much higher, with fail-safe backup systems in
place, as well as advanced error checking. One way to address this issue is to allow the advanced 
user interface to propose an action to the user, without actually implementing the action. 
However, in this case, the action and its proposed basis are preferably presented to the user in a 
sophisticated manner, to allow the basis for the action to be independently assessed by the user. 
Therefore, in a complex, multistep process, the user interface may be simplified by permitting a 
three step process: the user triggers a proposed response, analyzes the proposal and rationale, 
and confirms the proposal. Therefore, single step processes are inferior candidates for intelligent 
assistance. 

Another notable aspect of the technologies is the contextual analysis. Multimedia
data often includes a data component that closely corresponds to a format of a search criteria.
Thus, while a search may seek a particular image, other portions of the datastream correlate well
with the aspect of the image being searched, and may be analyzed by proxy, avoiding the need
for full image analysis. The resulting preselected reduced number of images may then be fully
analyzed, if necessary. Thus, especially with respect to consumer electronics applications,
where absolute accuracy may not be required, the processing power available for pattern
recognition need not be sufficient for complete real-time signal analysis of all data. The present
invention therefore proposes use of a variety of available data in order to achieve the desired
level of functionality at minimum cost.
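
The analysis by proxy described above may be sketched, by way of a non-limiting example, as a two-stage search: an inexpensive pass over inherently semantic data (here, closed caption text) preselects candidates, and the costly image analysis runs only on the reduced set. The event dictionary layout, the function names and the full_image_analysis callback are hypothetical and are introduced solely for illustration.

```python
def preselect(events, keywords):
    """Inexpensive first pass: keep events whose caption mentions a keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    return [event for event in events
            if any(keyword in event["caption"].lower() for keyword in wanted)]

def search(events, keywords, full_image_analysis):
    """Apply the costly analyzer only to the preselected candidates.

    full_image_analysis is a caller-supplied function that returns True
    when the image content of an event matches the request."""
    candidates = preselect(events, keywords)
    return [event for event in candidates if full_image_analysis(event)]
```

Because the second stage sees only the preselected events, the processing power required scales with the number of candidates rather than with the full data stream.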

One aspect of the present invention therefore relates to a mechanism for facilitating a user 
interaction with a programmable device. The interface and method of use of the present
invention serve to minimize the learning and searching times, better reflect users' expectations,
provide better matching to human memory limits, be usable by both novices and experienced 
users, reduce intimidation of novice users by the device, reduce errors and simplify the entering 
of programming data. The present invention optimizes the input format scheme for 
programming an event-driven device, and can also be applied to many types of programmable 
devices. Thus, certain human factors design concepts, heretofore unexploited in the design of 
consumer electronics devices and industrial controls, have been incorporated, and new precepts 
developed. Background and theory of various aspects of the present invention are disclosed in
"AN IMPROVED HUMAN FACTORED INTERFACE FOR PROGRAMMABLE DEVICES:
A CASE STUDY OF THE VCR", Master's Thesis, Tufts University (Master of Sciences in
Engineering Design, November, 1990, publicly available January, 1991), by Linda I. Hoffberg.
This thesis, and cited references, are incorporated herein by reference, and attached hereto as an 
appendix. Also referenced are: Hoffberg, Linda I., "Designing User Interface Guidelines For 
Time-Shift Programming of a Video Cassette Recorder (VCR)", Proc. of the Human Factors Soc. 
35th Ann. Mtg. pp. 501-504 (1991); and Hoffberg, Linda I., "Designing a Programmable 
Interface for a Video Cassette Recorder (VCR) to Meet a User's Needs", Interface 91 pp. 346-351 
(1991). See also, U.S. Patent Application No. 07/812,805, filed December 23, 1991, 
incorporated herein by reference in its entirety, including appendices and incorporated 
references. 

The present invention extends beyond simple predictive schemes which present 
exclusively a most recently executed command or most recently opened files. Thus, the possible 
choices are weighted in a multifactorial method, e.g., history of use, context and system status, 
rather than a single simple criterion alone. Known simple predictive criteria often exclude 
choices not previously selected, rather than weighing these choices in context with those which 
have been previously selected. While the system according to the present invention may include 
initial weightings, logical preferences or default settings, through use, the derived weightings are 
obtained adaptively based on an analysis of the status, history of use and context. It is noted that 
not all of the possible choices need be weighted, but rather merely a subset thereof. 
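
A minimal sketch of such multifactorial weighting follows, assuming each candidate choice carries numeric factor values for history of use, context and system status. The factor names, the linear scoring rule, and the update rate are assumptions made for the example; the disclosure does not prescribe a particular formula.

```python
def rank_choices(choices, weights):
    """Order choices by a weighted sum of their factor values.

    A choice never selected before (history 0.0) is still scored on its
    other factors rather than being excluded outright."""
    def score(choice):
        return sum(weight * choice["factors"].get(factor, 0.0)
                   for factor, weight in weights.items())
    return sorted(choices, key=score, reverse=True)

def adapt(weights, chosen, rate=0.1):
    """Shift the weights toward the factors present in the choice the user
    actually made, then renormalize so the weights sum to 1."""
    updated = {factor: weight + rate * chosen["factors"].get(factor, 0.0)
               for factor, weight in weights.items()}
    total = sum(updated.values())
    return {factor: weight / total for factor, weight in updated.items()}
```

Starting from initial or default weights, repeated calls to adapt yield derived weightings that reflect the observed use.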

For a given system, status, history of use and context may be interrelated factors. For 
example, the status of the machine is determined by the prior use, while the status also intersects 
context. The intended meaning of status is information relating to a path independent state of the 
machine at a given point in time. History of use is intended to implicate more than the mere 
minimum instructions or actions necessary to achieve a given state, and therefore includes 
information unnecessary to achieve a given state, i.e., path dependent information. Context is
also related to status, but is differentiated in that context refers to information relating to
the environment of use, e.g., the variable inputs or data upon which the apparatus acts or 
responds. Status, on the other hand, is a narrower concept relating more to the internal and 
constant functionality of the apparatus, rather than the particularities of its use during specific 
circumstances. 

U.S. Patent No. 5,187,797 relates to a machine interface system having hierarchical
menus, with a simple (three button) input scheme. The choice(s) presented relate only to the 
system status, and not the particular history of use employed to obtain the system status nor the 
context of the choice. This system has a predetermined hierarchical menu structure, which is 
invariant with usage. The goal of this interface system is not to provide a learning interface, but 
rather to teach the user about or conform the user to the dictates of the predetermined and 
invariant interface of the device. While many types of programmable devices are known to exist, 
normally, as provided in U.S. Patent No. 5,187,797, instructions are entered and executed in a 
predetermined sequence, with set branch points based on input conditions or the environment. 
See also U.S. Patent Nos. 4,878,179, 5,124,908, and 5,247,433. 

An aspect of the present invention provides a device having a predetermined or a generic 
style interface upon initial presentation to the user, with an adaptive progression in which 
specialized features become more easily available to a user who will likely be able to make use 
of them, while unused features are or remain "buried" within the interface. The interface also
extracts behavioral information from the user and alters the interface elements to optimize the
efficiency of the user.

A videocassette recorder is a ubiquitous example of a programmable device, and 
therefore forms the basis of much of the discussion herein. It should, of course, be realized that 
many of the aspects of the present invention could be applied by one of ordinary skill in the art to 
a variety of controls having human interfaces, and that these other applications are included 
within the scope of the present invention. 

The VCR apparatus typically involves a remote control entry device, and the interface of 
the present invention contains a graphical interface displayed for programming programmable 
devices. This aspect of the present invention seeks more accurate programming through the use 
of program verification to ensure that the input program is both valid and executable. Thus, it 
has a mechanism to store and check to verify that there are no conflicting programs. An 
apparatus according to the present invention can be connected, for example, to any infrared 
programmable device in order to simplify the programming process. By way of example only, 
an improved VCR interface forms the basis of a disclosed example. It is, of course, realized that 
the present method and apparatus may be applied to any programmable controller, i.e., any 
device which monitors an event or sensor and causes an event when certain conditions or
parameters are met, and may also be used in other programming environments, which are not 
event driven. While the present interface is preferably learning and adaptive, it may also detect 
events and make decisions based on known or predetermined characteristics. Where a number of 
criteria are evaluated for making a decision, conflicts among the various criteria are resolved 
based on a strength of an evaluated criteria, a weighting of the criteria, an interactivity function 
relating the various criteria, a user preference, either explicitly or implicitly determined, and a 
contextual analysis. Thus, a user override or preference input may be provided to assist in 
resolving conflicts. 
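
The program verification mentioned above, i.e., checking that an entered program is valid and does not conflict with stored programs, may be sketched as follows. The sketch assumes a single recording resource, so that any two programs whose time intervals overlap conflict; the dictionary keys and function names are hypothetical.

```python
from datetime import datetime

def is_valid(program):
    """A program is treated as valid here if it ends after it starts."""
    return program["end"] > program["start"]

def find_conflicts(programs):
    """Return pairs of program names whose recording intervals overlap."""
    ordered = sorted(programs, key=lambda program: program["start"])
    conflicts = []
    for index, first in enumerate(ordered):
        for second in ordered[index + 1:]:
            if second["start"] >= first["end"]:
                break  # sorted by start, so no later program overlaps `first`
            conflicts.append((first["name"], second["name"]))
    return conflicts

# Example stored programs (illustrative data only).
programs = [
    {"name": "news", "start": datetime(1999, 1, 4, 20, 0), "end": datetime(1999, 1, 4, 21, 0)},
    {"name": "movie", "start": datetime(1999, 1, 4, 20, 30), "end": datetime(1999, 1, 4, 21, 30)},
    {"name": "sports", "start": datetime(1999, 1, 4, 21, 30), "end": datetime(1999, 1, 4, 22, 0)},
]
```

A detected conflict could then be presented to the user, whose override or preference input resolves which program is retained.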

The present invention may incorporate an intelligent program recognition and 
characterization system, making use of any of the available cues, which allows an intelligent 
determination of the true nature of the broadcast and therefore is able to make a determination of 
whether parameters should be deemed met even with an inexact match to the specified 
parameters. Therefore, in contradistinction with VPV, the present invention provides, for 
example, intelligence. The VPV is much more like the "VCR Plus" device, known to those 
skilled in the art, which requires that a broadcast be associated with a predetermined code, with 
the predetermined code used as a criteria for initiating recording. Some problems with VCR Plus 
include identification of the codes which identify channel and time, post scheduling changes, 
incorrect VCR clock setting, and irregular schedules. VCR Plus also is limiting with respect to 
new technologies and cable boxes. 

The videotext signal of the prior art includes a digitally encoded text message that may be 
displayed in conjunction with the displayed image, similar to the closed caption system. The 
aforementioned West German system demonstrates one way in which the transmitted signal may 
be received by a device and interpreted to provide useful information other than the transmitted 
program itself. However, the prior art does not disclose how this signal may be used to index 
and catalog the contents of a tape, nor does it disclose how this signal may be used to classify or 
interpret the character of the broadcast. In other words, in one embodiment of the present 
invention, the videotext or closed caption signal is not only interpreted as a literal label, as in the 
prior art, but is also further processed and analyzed to yield data about the content of the 
broadcast, other than merely an explicit identification of the simultaneously broadcast
information. 

Beyond or outside the visible region of a U.S. National Television Standards Committee
(NTSC) broadcast video frame are a number of scan lines which are dedicated to presenting
digital information, rather than analog picture information. Various known coding schemes are
available for transmitting and receiving information in this non-viewing portion of the video
transmission, and indeed standards exist defining the content of these information fields. Of
course, various other transmission schemes provide a format for transmitting data. For example,
standard frequency modulation (FM) transmissions may be associated with digital data
transmissions in a subcarrier. Likewise, satellite transmissions may include digital data along
with an audio data stream or within a video frame, which may be in analog format or digitally 
encoded. 

Cable systems may transmit information either in the broadcast band or in a separate 
band. HDTV schemes also generally provide for the transmission of digital data of various sorts.
Thus, known audio and video transmission systems may be used, with little or no modification,
to provide enhanced functionality, according to the present invention. It is therefore possible to
use known and available facilities for transmitting additional information relating to the
broadcast information, in particular, the characteristics of the video broadcast, and doing so could
provide significant advantages, used in conjunction with the interface and intelligent pattern
recognition controller of the present invention. If this information were directly available, there
would be a significantly reduced need for advanced image recognition functions, such advanced 
image recognition functions requiring costly hardware devices, while still maintaining the 
advantages of the present invention. 

It is noted, however, that the implementation of a system in which characterization data
of the broadcast is transmitted along therewith might require a new set of standards and the
cooperation of broadcasters, as well as possibly the government regulatory and approval
agencies. The present invention does not require, in all of its aspects, such standardization, and
therefore may advantageously implement substantial data processing locally to the receiver. It is
nevertheless within the scope of the invention to implement such a broadcast system with
broadcast of characterization data in accordance with the present invention. Such broadcast
characterization data may include characterizations as well as preprocessed data useful for
characterizing according to flexible criteria in the local receiving device. 

According to the present invention, if such characterizations are broadcast, they may, as 
stated above, be in band or out of band, e.g., making use of unused available spectrum bandwidth 
within the NTSC channel space, or other broadcast system channel space, or may be "simulcast" 
on a separate channel, such as an FM sideband or separate transmission channel. Use of a 
separate channel would allow a separate organization, other than the network broadcasters, to 
provide the characterization data for distribution to users of devices that make use of the present 
intelligent system for controlling a VCR or other broadcast information processing device. Thus, 
the characterization generating means need not be directly linked to the local user machine in 
order to fall within the scope of the present invention. The present invention also provides a 
mechanism for copyright holders or other proprietary interests to be protected, by limiting access 
to information by encryption or selective encryption, and providing an accounting system for
determining and tracking license or broadcast fees. 

Research has been performed relating to VCR usability, technology, implementation, 
programming steps, current technology, input devices, and human mental capacity. This 
research has resulted in a new paradigm for the entry of programming data into a sequential 
program execution device, such as a VCR, by casual users. 

Four major problems in the interfaces of VCRs were found to exist. The first is that users
spend far too much time searching for the information needed to
complete the programming process. Second, many people do not program the VCR to record at
a later time (time-shift) frequently, and thus forget the programming steps in the interim, i.e., the 
inter-session decay of the learning curve is significant. Third, the number of buttons on many 
remote control devices has become overwhelming. Fourth, people have become reluctant to 
operate or program VCRs because of their difficult operation. It was found that, by minimizing 
the learning and searching times, the user's programming time and frustration level can be greatly 
reduced. If VCRs are easier to program, users might program them more frequently. This would 
allow more efficiency and flexibility in broadcast scheduling, especially late night for time shift 
viewing. The present invention therefore provides an enhanced VCR programming interface 
having a simplified information structure, an intuitive operational structure, simplified control
layout and enhanced automated functionality. 

A new class of consumer device has been proposed, which replaces the videotape of a 
traditional videotape recorder with a random-access storage device, such as a magnetic hard disk 
drive. Multimedia data is converted through a codec (if necessary), and stored in digital form. 
Such systems are proposed by Tivo, Inc., Philips Electronics (Personal TV), Replay Networks, 
Inc. and Metabyte, Inc. Some of these systems employ a user preference based 
programming/recording method similar to that of the present invention. 

In these systems, typically a content descriptive data stream formulated by human editors 
accompanies the broadcast or is available for processing and analysis. Based on a relation of the 
user preferences, which may be implied by actual viewing habits or input through simple 
accept/veto user feedback, selected media events may be recorded. However, such systems rely 
on a correspondence between the factors of interest to users and those encoded in the data stream, 
e.g., a "program guide". This is not always the case. However, where the available data 
describing the program maps reasonably well into the user preference space, such a system may 
achieve acceptable levels of performance, or stated otherwise, the program material selected by 
the system will be considered acceptable. 
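
As a hedged illustration of preference-based recording driven by accept/veto feedback, the following sketch keeps a per-descriptor score and records a program when the sum of the scores of its descriptors exceeds a threshold. The descriptor tags stand in for whatever the content descriptive data stream provides; the update rate, threshold and function names are assumptions of the example.

```python
def update_profile(profile, program_tags, accepted, rate=0.2):
    """Return a new profile with each tag of the program nudged up
    (accept) or down (veto) by `rate`."""
    delta = rate if accepted else -rate
    updated = dict(profile)
    for tag in program_tags:
        updated[tag] = updated.get(tag, 0.0) + delta
    return updated

def should_record(profile, program_tags, threshold=0.0):
    """Record when the summed scores of the program's tags exceed the
    threshold. Tags absent from the profile contribute nothing."""
    return sum(profile.get(tag, 0.0) for tag in program_tags) > threshold
```

The sketch also shows the limitation noted above: a program whose descriptors do not map onto anything in the profile scores zero, whatever its actual appeal to the user.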

One particular aspect of these time-shifting consumer media recording devices is how 
they deal with advertising materials which accompany program material. In many instances, the 
user seeks to avoid "commercials", and the device may be programmed to oblige. However, as 
such devices gain wider acceptance, advertisers will be reluctant to subsidize broadcasts. 
Therefore, an advertising system may be integrated into the playback device which seeks to 
optimize the commercial messages presented to a viewer. By optimizing the messages or 
advertisements, the viewer is more receptive to the message, and economic implications ensue. 
For example, a viewer may be compensated, directly or indirectly, for viewing the commercials, 
which may be closely monitored and audited, such as by taking pictures of the audience in front 
of a "set-top box". The acquired data, including viewer preferences, may be transmitted back to
commercial sponsors, allowing detailed demographic analysis. 

In order to ensure privacy, the preference information and/or images may be analyzed by 
a proxy, with the raw data separated from the commercial users of such data. Thus, for example, 
the particular users of a system may register their biometric characteristics, e.g., face. Thereafter,
the imager captures facial images and correlates these with its internal database. The image itself 
therefore need not be stored or transmitted. Viewer preferences and habits, on the other hand, 
likely must be transmitted to a central processing system for analysis. 

Because the system is intelligent, copy protection and royalty accounting schemes may
readily be implemented. Thus, broadcasters and content providers may encode broadcasts in 
such a way as to control the operation of the consumer device. For example, an IEEE-1394-type 
encryption key support/copy protection or DIVX scheme may be implemented. Further, certain 
commercial sponsors may be able to avoid deletion of their advertisement, while others may
allow truncation. The acceptability of this to the consumer may depend on subsidies. In other
words, a company is willing to pay for advertising. Instead of paying for placements directly to
the media, a portion is paid to a service provider, based on consumer viewing. The media, on the
other hand, may seek to adopt a pay-per-view policy, at least with respect to the service provider,
in lieu of direct advertising revenues. The service provider will account to both advertisers and
content providers for use. With sufficient viewing of commercials, the entire service charge for a
system might be covered for a user. On the other hand, a viewer might prefer to avoid all
commercials, and not get the benefit of a subsidy. The service provider performs the
economically efficient function of delivering optimized, substituted commercials for the almost
random commercials which flood the commercial broadcast networks, and thus can accrue
greater profits, even after paying content providers a reasonable fee. An advertiser, by selecting
a particular audience, may pay less than it would otherwise pay to a broadcaster. The content 
providers may also charge more for the privilege of use of their works. 

As stated above, the content may be copy protected by the use of encryption and/or 
lockout mechanisms. Thus, by providing an alternative to an analog VCR, a full end-to-end
encrypted signal may be provided, such as that proposed for the IEEE-1394 copy protection
scheme. Because enhanced recording capabilities are provided to the consumer, the acceptance
will be high. Because of the encryption, lack of portability and continued royalty accounting, 
content provider acceptance will also likely be high. 

The user interface concepts according to the present invention are easily applied to other
special purpose programmable devices, and also to general purpose programmable devices
wherein the programming paradigm is event-driven, as well as other programming systems. It
should also be noted that it is within the scope of the present invention to provide an improved 
interface and programming environment for all types of programmable devices, and in this 
regard, the present invention incorporates adaptive features which optimize the programming 
environment for both the level of the user and the task to be programmed. 

In optimizing the interface, four elements are particularly important: the input device, the 
display format, the sequence of the programming operation, and the ability of the device to 
properly interpret the input as the desired program sequence. 

The present invention proceeds from an understanding that an absence of user frustration 
with respect to a programmable consumer or industrial device or interface, may be particularly 
important with respect to achieving the maximum potential functionality thereof. The interface 
must be designed to minimize the user's frustration level. This can be accomplished by clearly 
furnishing the possible choices, presenting the data in a logical sequence, and leading the user 
through the steps necessary to program the device. 

When applied to other than audiovisual and/or multimedia applications, the pattern
recognition function may be used to control the execution of a program or selectively control 
execution of portions of the software. For example, in a programmable temperature controller 
application, a sensor or sensor array could be arranged to detect a "door opening". On the 
occurrence of the door opening, the system would recognize this pattern, i.e. a mass of air at a 
different temperature entering the environment from a single location, or a loss of climate 
controlled air through a single location. In either event, the system would take appropriate
action, including: halting normal climate control and imposing a delay until the door is closed;
after closure, setting a time constant for maintenance of a steady state of the replaced air with the
climate controlled air; based on the actual climatic condition after assimilation, or a predicted
climatic condition after assimilation, beginning a climate compensation control; and optionally, during
the door opening, controlling a pressure or flow of air to counterbalance the normal flow through the
door, by using a fan or other device. The climate may differ in temperature, humidity, pollutants,
or the like, and appropriate sensors may be employed. 
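
The door-opening response described above may be sketched as a small state machine. The sketch assumes that the pattern recognition stage has already reduced the sensor data to a door_open flag, and it models the time constant as a fixed number of control ticks; the class, mode names and tolerance are hypothetical and serve only to illustrate the sequence of halting, settling and compensating.

```python
from enum import Enum

class Mode(Enum):
    NORMAL = "normal"
    HALTED = "halted"              # door open: normal climate control suspended
    SETTLING = "settling"          # door closed: waiting for the air to mix
    COMPENSATING = "compensating"  # driving the climate back to the setpoint

class ClimateController:
    def __init__(self, settle_ticks=3):
        self.mode = Mode.NORMAL
        self.settle_ticks = settle_ticks
        self._remaining = 0

    def step(self, door_open, temperature, setpoint, tolerance=0.5):
        """Advance one control tick and return the resulting mode."""
        if door_open:
            self.mode = Mode.HALTED
        elif self.mode is Mode.HALTED:
            # Door has just closed: start the settling delay.
            self.mode = Mode.SETTLING
            self._remaining = self.settle_ticks
        elif self.mode is Mode.SETTLING:
            self._remaining -= 1
            if self._remaining <= 0:
                off_target = abs(temperature - setpoint) > tolerance
                self.mode = Mode.COMPENSATING if off_target else Mode.NORMAL
        elif self.mode is Mode.COMPENSATING:
            if abs(temperature - setpoint) <= tolerance:
                self.mode = Mode.NORMAL
        return self.mode
```

The same structure could be extended to humidity or pollutant readings by replacing the temperature comparison with a comparison over the relevant sensor values.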

The present invention also allows a dynamic user preference profile determination based 
on explicit or implicit desires, e.g., moods, which assist in processing data to make decisions 
which conform to the user preference at a given point in time. For example, voice patterns, skin
temperature, heart pulse rate, external context, skin resistance (galvanic skin response), blood
pressure, stress, as determined by EMG, EEG or other known methods, spontaneous motor 
activity or twitching, may be detected in order to determine or infer a user mood, which may be 
used as a dynamic influence on the user preference. These dynamic influences are preferably 
stored separately from static influences of the preferences, so that a resultant determined 
preference includes a dynamic influence based on a determined mood or other temporally 
varying factor and a static influence associated with the user. 
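
One way to keep the static and dynamic influences separate while still producing a single resultant preference is sketched below. The linear blend and the mood_weight parameter are assumptions of the example; the disclosure requires only that the two influences be stored separately and combined.

```python
def effective_preference(static_profile, dynamic_influence, mood_weight=0.3):
    """Blend a stored static profile with a temporally varying influence.

    Neither input is modified, so the static profile persists unchanged
    when the inferred mood later changes."""
    keys = set(static_profile) | set(dynamic_influence)
    return {
        key: (1 - mood_weight) * static_profile.get(key, 0.0)
             + mood_weight * dynamic_influence.get(key, 0.0)
        for key in keys
    }
```

For instance, a static liking for comedy could be temporarily reduced when the inferred mood disfavors it, without altering the stored long-term profile.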

When a group of people are using the system simultaneously, the system must make a 
determination of a composite preference of the group. In this case, the preferences of the 
individuals of the group, if known, may be correlated to produce an acceptable compromise. 
Where individual preferences are not a priori known, individual or group "interviews" may be 
initially conducted to assist in determining the best composite group preference. 

It is therefore an object according to the present invention to provide a radio receiver or 
video receiver device, having a plurality of different available program sources, determining a 
program preference for one or more individuals subject to a presented program, comparing the 
determined program preference and a plurality of different program sources, and selecting at least
one program based on the comparison. 

In formulating a group preference, individual dislikes may be weighted more heavily than 
likes, so that the resulting selection is tolerable by all and preferable to most group members. 
Thus, instead of a best match to a single preference profile for a single user, a group system 
provides a most acceptable match for the group. It is noted that this method is preferably used in 
groups of limited size, where individual preference profiles may be obtained, in circumstances 
where the group will interact with the device a number of times, and where the subject source 
program material is the subject of preferences. Where large groups are present, demographic 
profiles may be employed, rather than individual preferences. Where the device is used a small 
number of times by the group or members thereof, the training time may be very significant and 
weigh against automation of selection. Where the source material has little variety, or is not the 
subject of strong preferences, the predictive power of the device as to a desired selection is 
limited. 
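
By way of illustration only, the weighting of dislikes described above might be sketched as follows. The rating scale of -1 to 1, the `dislike_weight` factor, and the function names are assumptions introduced for this example and are not taken from the specification.

```python
def group_score(ratings, dislike_weight=2.0):
    """Combine individual ratings in [-1, 1] into one group score.

    Negative ratings (dislikes) are multiplied by dislike_weight, so a
    dislike counts for more than an equally strong like.
    """
    total = 0.0
    for r in ratings:
        total += r * dislike_weight if r < 0 else r
    return total / len(ratings)


def select_for_group(programs, dislike_weight=2.0):
    """Return the program name with the highest group score.

    programs maps a program name to the list of member ratings for it.
    """
    return max(programs, key=lambda name: group_score(programs[name], dislike_weight))


# Hypothetical ratings from three group members.
programs = {
    "drama": [1.0, 1.0, -0.6],   # two strong likes, one dislike
    "nature": [0.4, 0.5, 0.3],   # mildly liked by everyone
}
choice = select_for_group(programs)
```

With these hypothetical ratings, an unweighted average would favor "drama", while the dislike-weighted score favors "nature", which no member objects to.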

The present invention provides a system and method for making use of the available 
broadcast media forms for improving an efficiency of matching commercial information to the 
desires and interests of a recipient, improving a cost effectiveness for advertisers, improving a 
perceived quality of commercial information received by recipients and increasing profits and 
reducing required information transmittal by publishers and media distribution entities. 

This improved advertising efficiency is accomplished by providing a system for collating 
a constant or underlying published content work with a varying, demographically or otherwise 
optimized commercial information content. This commercial information content therefore need 
not be predetermined or even known to the publisher of the underlying works, and in fact may be 
determined on an individual receiver basis. It is also possible to integrate the demographically 
optimized information within the content. For example, overlays in traditional media, and 
electronic substitutions or edits in new media, may allow seamless integration. The content 
alteration need not be only based on commercial information, and therefore the content may vary 
based on the user or recipient. 

U.S. Patent No. 5,469,206, expressly incorporated herein by reference, relates to a system 
that automatically correlates user preferences with electronic shopping information to create a 
customized database for the user. 

Therefore, the granularity of demographic marketing may be very fine, on a receiver-by- 
receiver basis. Further, the accounting for advertisers will be more accurate, with a large sample 
and high quality information. In fact, in a further embodiment, an interactive medium may be 
used allowing immediate or real time communication between recipient and advertiser. This 
communication may involve the Internet, private networks or dial-up connections. Because the 
commercial messages are particularly directed to recipients, communication with each selected 
recipient is more valuable to an advertiser and that advertiser is willing to pay more for 
communication with each selected recipient. Recipients may therefore be selected to receive the 
highest valued appropriate commercial message(s). Thus, advertisers will tend to pay less and 
media producers will gain more revenues. Recipients will gain the benefit of selected and 
appropriate media, and further, may provide feedback for determining their preferences, which 
will likely correspond with their purchasing habits. Thus, the recipient will benefit by receiving 
optimized information. 

Likewise, a recipient may place a value on receiving certain information, which forms the 
basis for "pay-per-view" systems. In this case, the recipient's values may also be considered in 
defining the programming. 

This optimization is achieved by providing a device local to the recipient which 
selectively presents commercial information to the recipient based on characteristics individual 
to the recipient, which may be input by the recipient, the publisher, the advertiser, and/or learned 
by the system based on explicit or implicit feedback. The local device either has a local memory 
for advertising materials, or a telereception link for receiving commercial information for 
presentation, either on a real time basis or stored for later presentation. In a further embodiment, 
a user may control the content and/or commercial information received. In this case, the 
accounting system involves the user's account, and, for example, the recipient may be denied the 
subsidy from the commercial advertiser, and pay for the privilege of commercial free content. 

It is also possible to employ the methods and systems according to the present invention 
to create a customized publication, which may be delivered physically to the recipient, for 
example as print media, facsimile transmission, e-mail, R-CD-ROM, floppy disk, or the like, 
without having a device local to the consumer. 

It is noted that this system and method is usable for both real time media, such as 
television, radio and on-line telecommunication, as well as manually distributed periodicals, such 
as newspapers, magazines, CD-ROMs, diskettes, etc. Therefore, the system and method 
according to the present invention includes a set of related systems with varying details of 
implementation, with the underlying characteristic of optimization of variable material 
presentation at the recipient level rather than the publisher level. 

The system and method according to the present invention preferably includes an 
accounting system which communicates information relating to receipt of commercial 
advertising information by a recipient to a central system for determination of actual receipt of 
information. This feedback system allows verification of receipt and reduces the possibility of 
fraud or demographic inaccuracies. 

The accounting system, for example, may place value on the timeslot, associated content, 
the demographics of the user, user's associated valuation, competition for placement, past history 
(number of impressions made to same recipient) and exclusivity. 

A preferred embodiment includes a subscription television system having a plurality of 
received channels. At least one of these channels is associated with codes to allow determination 
of content from variable segments. It is also possible to identify these variable segments without 
these codes, although the preferred system includes use of such codes. These codes also allow 
simple identification of the content for accounting purposes. Upon detection of a variable 
segment, a commercial advertisement is selected for presentation to the recipient. This variable 
segment is selected based on the characteristics of the recipient(s), the history of use of the 
device by the recipient(s), the context of use, the arrangements made by the commercial 
information provider(s) for presentation of information, and the availability of information for 
presentation. Other factors may include the above-mentioned accounting system factors. 
Typically, the local device will include a store of commercial information, downloaded or 
otherwise transmitted to the recipient (e.g., a CD-ROM or DVD with MPEG-2 compressed 
images). A telecommunication link may also be provided to control the process, provide 
parameters for the presentation or the information itself. This telecommunication link may be 
provided through the public telephone network, Internet, private network (real or virtual), cable 
network, or a wireless network, for example. Generally, the underlying work will have a gap of 
fixed length, so that the commercial information must be selected to fit in this gap. Where the 
gap is of variable length, such as might occur in live coverage, the commercial information is 
interrupted or the underlying work buffered and delayed to prevent loss. Thus, the presentation 
to the user is constructed from pieces, typically at the time of presentation, and may include 
invariable content, variable content, invariable messages, variable messages, targeted content 
and/or messages, and hypervariable content. Hypervariable content includes, for example, 
transition material selected based on the stream of information present, and other presentations 
which may optionally include useful information which are individualized for the particular 
recipient or situation. 
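
As a minimal sketch of the fixed-length gap constraint described above, a local device might pick, from its store, the highest-valued commercial that fits the gap. The `Commercial` record, its `value` field (standing in for whatever the accounting system assigns), and `select_for_gap` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Commercial:
    name: str
    duration_s: int   # running time in seconds
    value: float      # hypothetical value assigned by the accounting system


def select_for_gap(store, gap_s):
    """Return the highest-valued stored commercial that fits the gap, or None."""
    fitting = [c for c in store if c.duration_s <= gap_s]
    if not fitting:
        return None
    return max(fitting, key=lambda c: c.value)


# Hypothetical local store of commercial information.
store = [
    Commercial("car", 60, 5.0),
    Commercial("soda", 30, 3.0),
    Commercial("bank", 30, 2.0),
]
```

Here a 30 second gap would yield "soda" and a 60 second gap "car"; a gap shorter than every stored item yields nothing, and the device would then need some other fill material.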

According to another embodiment, a recording, such as on a videotape, is retained by a 
recipient which includes proprietary content. This may include a commercial broadcast, a private 
broadcast, or distributed media. In the case of a commercial broadcast, some or all of the 
commercial advertising or other time-sensitive information is old and/or stale. Therefore, in 
operation, this old or time sensitive information is eliminated and substituted with new and/or 
different information. Thus, the presentation system freshens the presentation, editing and 
substituting where necessary. 

By such a method, content distributed even through private channels may include 
advertisements, and thus be subsidized by advertisers. The advertisements and other added 
content are generally more acceptable to the audience because they are appropriately targeted. 

For example, where the broadcaster has a high degree of control over the initial 
broadcast, e.g., pay per view under license, or where the broadcaster may claim substantial 
continuing rights in the work after recording, the enforcement of a proprietary replay system may 
be accepted. For example, a work is broadcast as an encrypted digital data stream, with selective 
decryption at the recipient's receiver, under license from the broadcaster. In this case, a 
recording system is provided which retains the encryption characteristics, ensuring the integrity 
of the accounting process. During presentation of the recorded work, commercial information is 
appropriately presented to the recipient during existing or created gaps, or in an associated output 
separate from the content presentation. The recipient, as a result, receives the benefit of the 
original subsidy, or may receive a new subsidy. 

Therefore, similar to the known DIVX system, an encrypted media may be mass 
distributed, which requires authorization for display. Instead, however, of requiring the recipient 
to pay for the initial and subsequent displays of the content, the player integrates advertising 
content into the output, which may vary based on the audience, time and past history, as well as 
other factors discussed herein. Given the interactive and variable nature of the presentation, the 
user or audience may even veto ("fast forward through") a particular commercial. In this case, 
the user may have to account for a fee, or other advertisers may take up the slack. The veto 
provides information regarding the desires of the viewer, and may be used to help select future 
messages to be displayed or presented. 

According to another embodiment, a radio transmission/reception system is provided 
which broadcasts content, an overlay track and variable commercial information. The invariant 
works are preferably prerecorded music. The overlay track is preferably a "DJ", who provides 
information regarding the invariant works, commercial information or news. The commercial 
information in this instance therefore refers to prerecorded segments. In this instance, the goal is 
to allow the invariant works to be received by the recipient and presented with improved 
optimization of the commercial information content and other messages presented at the time of 
output. Further, this system allows optimization of the presentation of the invariant portions as 
well, i.e., the commercial information and the program content may be independently selected at 
the receiver, with appropriate accounting for commercial subsidy. In a mobile receiver, it is 
preferable to include as a factor in the selection of commercial information a location of the 
receiver, as might be obtained from a GPS system, cellular location system, intelligent highway 
system or the like. This would allow geographically appropriate selection of commercial 
information, and possibly overlay information as well, e.g., traffic reports. 
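
The use of receiver location as a selection factor might be sketched as below. This is an illustrative assumption only: the message records with a center point and `radius_km`, and the use of the haversine great-circle formula, are choices made for the example rather than details given in the specification.

```python
import math


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, by the haversine formula."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def local_messages(messages, lat, lon):
    """Names of messages whose service area covers the receiver position."""
    return [m["name"] for m in messages
            if distance_km(lat, lon, m["lat"], m["lon"]) <= m["radius_km"]]


# Hypothetical commercial messages, each tied to a service area.
messages = [
    {"name": "nyc_dealer", "lat": 40.7128, "lon": -74.0060, "radius_km": 50},
    {"name": "la_dealer", "lat": 34.0522, "lon": -118.2437, "radius_km": 50},
]
nearby = local_messages(messages, 40.73, -73.99)
```

The same filter could be applied to overlay information such as traffic reports, with the position supplied by whichever location system the receiver has available.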

Another embodiment according to the present invention provides a hypertext linked 
media or multimedia environment, such as HTML/World Wide Web, wherein information 
transmitted and/or displayed is adaptively selected based on the particular user or the user's 
receiving system. Thus, various elements may be dynamically substituted during use. 

Therefore, it is an object according to the present invention to provide adaptive man- 
machine interfaces, especially computer graphic user interfaces, which are economically 
improved to provide an optimized environment. Productivity of computer operators is limited by 
the time necessary to communicate a desired action through the user interface to the device. To 
reduce this limitation, most likely user actions are predicted and presented as easily available 
options. The technologies also extend beyond this core theme in many differing ways, 
depending on the particular application. 

The system also provides an intelligent, adaptive pattern recognition function in order to 
provide the operator with a small number of high probability choices, which may be complex, 
without the need for explicit definition of each atomic instruction comprising the desired action. 
The interface system predicts a desired action based on the user input, a past history of use, and a 
context of use. 
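
One simple way to picture prediction from a past history of use and a context of use is a frequency count kept per context. The sketch below is an assumption-laden illustration, not the adaptive pattern recognition function of the specification; `ActionPredictor`, its methods, and the string-valued contexts are hypothetical.

```python
from collections import Counter, defaultdict


class ActionPredictor:
    """Rank likely next actions from a history of (context, action) pairs."""

    def __init__(self):
        self.by_context = defaultdict(Counter)  # context -> action counts
        self.overall = Counter()                # action counts across all contexts

    def record(self, context, action):
        """Add one observed action to the history."""
        self.by_context[context][action] += 1
        self.overall[action] += 1

    def predict(self, context, n=3):
        """Return up to n actions: those seen in this context first, then the rest."""
        ranked = [a for a, _ in self.by_context[context].most_common()]
        for a, _ in self.overall.most_common():
            if a not in ranked:
                ranked.append(a)
        return ranked[:n]


# Hypothetical usage history.
predictor = ActionPredictor()
for _ in range(3):
    predictor.record("evening", "record news")
predictor.record("evening", "play movie")
for _ in range(2):
    predictor.record("morning", "set timer")
```

The short ranked list returned by `predict` corresponds to the small number of high probability choices offered to the operator; an unseen context falls back on overall usage.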

In yet another embodiment, a present mood of a user is determined, either explicitly or 
implicitly, and the device selects program material that assists in a desired mood transition. The 
operation of the device may additionally acquire data relating to an individual and the respective 
moods, desires and characteristics, altering the path provided to alter the mood based on the data 
relating to the individual. As stated above, in a group setting, a most acceptable path is presented 
rather than a most desirable path as presented for an individual. 

In determining mood, a number of physiologic parameters may be detected. In a training 
circumstance, this set of parameters is correlated with a temporally associated preference. 
Thus, when a user inputs a preference into the system as feedback, mood data is also obtained. 
Invariant preferences may be separated, and analyzed globally, without regard for temporal 
variations, while varying preferences are linked with information regarding the surrounding 
circumstances and stored. For example, the preference data may be used to train a neural 
network, e.g., using backpropagation of errors or other known methods. The inputs to the neural 
network include available data about surrounding context, such as time, environmental 
brightness, and persons present; source program choices, which may be raw data, preprocessed 
data, and abstracted data; explicit user input; and, in this embodiment, mood parameters, which 
may be physiological or biometric data, voice pattern, or implicit inputs. An example of an 
implicit input is an observation of a man-machine interaction, such as a video game. The manner 
in which a person plays a video game or otherwise interacts with a machine may provide 
valuable data for determining a mood or preference. 
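
For illustration, training a small network on preference feedback by backpropagation of errors might look like the sketch below. The single hidden layer, the sigmoid units, the learning rate, and the two toy inputs (an "evening" flag and a "calm mood" flag) are all assumptions chosen to keep the example short; real inputs would be the context, program, and mood parameters listed above.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, hidden=3, epochs=2000, rate=0.5, seed=0):
    """Train a one-hidden-layer network by backpropagation of errors.

    samples is a list of (inputs, target) pairs with target in [0, 1].
    Returns a function mapping inputs to a predicted preference.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each weight row carries a trailing bias term.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w_h]
        y = sigmoid(sum(w * v for w, v in zip(w_o, h)) + w_o[-1])
        return h, y

    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(x)
            d_o = (y - t) * y * (1 - y)  # error signal at the output unit
            # Propagate the error back to the hidden units before updating w_o.
            d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w_o[j] -= rate * d_o * h[j]
            w_o[-1] -= rate * d_o
            for j in range(hidden):
                for i in range(n_in):
                    w_h[j][i] -= rate * d_h[j] * x[i]
                w_h[j][-1] -= rate * d_h[j]

    return lambda x: forward(x)[1]


# Hypothetical feedback: inputs are [evening, calm_mood]; target 1 = liked a quiet program.
feedback = [([0, 0], 0), ([1, 0], 0), ([0, 1], 1), ([1, 1], 1)]
predict = train(feedback)
```

In this toy data the preference depends only on the mood input, so the trained network should score calm-mood cases higher regardless of the time of day.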

According to one embodiment of the invention, the image is preprocessed to decompose 
the image into object-elements, with various object-elements undergoing separate further 
processing. For example, certain backgrounds may be aesthetically modeled using simple fractal 
equations. While, in such circumstances the results may be inaccurate in an absolute sense, they 
may be adequate in a performance sense. Faces, on the other hand, have common and variable 
elements. Therefore, a facial model may be based on parameters having distinguishing power, 
such as width between eyes, mouth, shape of ears, and other proportions and dimensions. Thus, 
along with color and other data, a facial image may be stored as a reference to a facial model 
with the distinguishing parameters for reconstruction. Such a data processing scheme may 
produce a superior reconstructed image and allow for later recognition of the face, based on the 
stored parameters in reference to the model. Likewise, many different elements of an image may 
be extracted and processed in accordance with specific models to produce differentiating 
parameters, wherein the data is stored as a reference to the particular model along with the 
particular data set derived from the image. Such a processing scheme allows efficient image 
storage along with ease of object recognition, i.e., distinction between objects of the same class. 
This preprocessing provides a highly asymmetric scheme, with a far greater processing 
complexity to initially process the image than to subsequently reconstruct or otherwise later 
employ the data. 

By employing a model-based object extraction system, the available bandwidth may be 
efficiently used, so that objects which fall within the scope of an available model may be 
identified with a model identification and a series of parameters, and objects not within the scope 
of a model may be allocated a comparatively greater bandwidth for general image description, 
e.g., JPEG, MPEG-1/MPEG-2, wavelet/standard fractal image compression (FIC), or other 
image processing schemes. In a worst case, therefore, the bandwidth required will be only 
slightly greater than that required for a corresponding standard method, due only to the additional 
overhead to define data types, as necessary. However, by employing a model-based object 
decomposition processing system, recognized elements may be described using only a small 
amount of data and a greater proportion of data used to describe unrecognized elements. Further, 
the models available may be dynamically updated, so that, as between a communicating 
transmitter and receiver, retransmission of unrecognized elements will be eliminated as a model 
is constructed. 
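
The bandwidth allocation described above can be pictured with the following sketch, in which an object covered by a shared model is sent as a model identification plus parameters, and any other object falls back to a general description. The `MODELS` table, the field names, and the dictionary record layout are hypothetical and serve only to show the data-type overhead and the size difference.

```python
import json

# Hypothetical model table shared by transmitter and receiver:
# model identification -> names of its distinguishing parameters.
MODELS = {"face": ("eye_width", "mouth_width"), "sky": ("hue",)}


def encode(objects):
    """Encode each object as a model reference when possible, else as raw data."""
    records = []
    for obj in objects:
        names = MODELS.get(obj.get("kind"))
        if names and all(n in obj for n in names):
            records.append({"model": obj["kind"], "params": [obj[n] for n in names]})
        else:
            # No applicable model: spend the bandwidth on a general description.
            records.append({"raw": obj["pixels"]})
    return records


# Hypothetical extracted objects; "pixels" stands in for general image data.
objects = [
    {"kind": "face", "eye_width": 62, "mouth_width": 50, "pixels": [0] * 1000},
    {"kind": "logo", "pixels": [1, 2, 3]},
]
records = encode(objects)
```

The record keys ("model" or "raw") play the role of the additional overhead that defines the data type; adding an entry to `MODELS` on both ends would correspond to dynamically updating the available models.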

Where image processing systems may produce artifacts and errors, an error minimization 
function may also be provided which compares an original image with a decomposed- 
recomposed image and produces an error function which allows correction for these errors. This 
error function may be transmitted with the processed data to allow more faithful reproduction. In 
a pattern recognition context, the error function may provide useful data relating to the reliability 
of a pattern correlation, or may provide useful data outside of the model and associated 
parameters for pattern recognition. 
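
A minimal sketch of such an error function, assuming the image is represented as a flat list of sample values, is a per-sample residual that the receiver adds back. The mean squared residual used here as a reliability figure is one possible choice and is an assumption of the example.

```python
def error_function(original, reconstructed):
    """Per-sample residual between an original and its decomposed-recomposed copy."""
    return [o - r for o, r in zip(original, reconstructed)]


def correct(reconstructed, residual):
    """Apply a transmitted residual to recover the original samples."""
    return [r + e for r, e in zip(reconstructed, residual)]


def mean_squared_error(residual):
    """Smaller values suggest the model fit, and any pattern match, is more reliable."""
    return sum(e * e for e in residual) / len(residual)


# Hypothetical sample values.
original = [10, 12, 15, 20]
reconstructed = [10, 11, 15, 22]
residual = error_function(original, reconstructed)
```

Sending `residual` with the processed data permits the more faithful reproduction, while a recognition stage could discard it and keep only the summary figure.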

Thus, in the case of an object-extraction model-based processing system, the resulting 
data stream may be appropriate for both viewing and recognition. Of course, acoustic data may 
be likewise processed using acoustic models with variable parameters. However, in such a 
system, information for pattern recognition may be filtered, such as eliminating the error function 
or noise data. Further, certain types of objects may be ignored, for example, under normal 
circumstances, clouds in the sky provide little information for pattern recognition and may be 
removed. In such a system, data intended for viewing or listening will likely contain all objects 
in the original data stream, with as much original detail as possible given data storage and 
bandwidth constraints. 

An object extraction model based processing system also allows for increased noise 
rejection, such as over terrestrial broadcast channels. By transmitting a model, the receiving 
system may interpolate or extrapolate data to fill in for missing data. By extrapolate, it is meant 
that past data is processed to predict a subsequent condition. By interpolate, it is meant that data 
presentation is delayed, and missing data may therefore be predicted from both past and 
subsequent data transmission. Missing portions of images may also be reconstructed from 
existing portions. This reconstruction process is similar to that described in U.S. Pat. No. 
5,247,363, to reconstruct MPEG images; except that where model data is corrupted, the 
corruption must be identified and the corrupt data eliminated and replaced with predicted data. 
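
The distinction drawn above between extrapolation and interpolation can be sketched on a one-dimensional stream of parameter values, with `None` marking a missing sample. The linear predictors and the `fill_missing` helper are assumptions for illustration; a real system would predict from the transmitted model.

```python
def extrapolate(prev2, prev1):
    """Predict the next sample from the two most recent past samples (linear)."""
    return prev1 + (prev1 - prev2)


def interpolate(before, after):
    """Estimate a missing sample from its past and subsequent neighbours."""
    return (before + after) / 2


def fill_missing(samples):
    """Replace None entries: interpolate if a later sample exists, else extrapolate."""
    out = list(samples)
    for i, v in enumerate(out):
        if v is not None:
            continue
        prev = out[i - 1] if i >= 1 else None
        nxt = out[i + 1] if i + 1 < len(out) else None
        if prev is not None and nxt is not None:
            out[i] = interpolate(prev, nxt)
        elif i >= 2 and prev is not None and out[i - 2] is not None:
            out[i] = extrapolate(out[i - 2], prev)
        # Otherwise there is too little data and the gap is left as None.
    return out
```

Interpolation is only possible because presentation is delayed until the subsequent sample arrives; at the end of the received data the sketch falls back on extrapolation from the past.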

It is therefore an object according to the present invention to provide a programmable 
control, having a status, responsive to a user input and a signal received from a signal source, 
comprising a controller, for receiving the user input and the signal and producing a control 
output; a memory for storing data relating to an activity of the user; a data processing system for 
adaptively predicting a most probable intended action of the user based on the stored data 
relating to the activity of the user and derived weighing of at least a subset of possible choices, 
the derivation being based on a history of use, a context of a respective choice and the status of 
the control; and a user feedback data presenting system comprising an output device for 
presentation of a variable sequence of programming options to the user, including the most 
probable intended action of the user, in a plurality of output messages, the output messages 
differing in available programming options. 

The programmable control may be employed for performing an action based on user 
input and an information content of a signal received from a signal source, wherein the output 
device includes a display device, further comprising a user controlled direct manipulation-type 
input device, associated with the display device, having a device output, the device output being 
the user input; a plant capable of performing the action, being responsive to an actuator signal; 
and the controller, being for receiving data from the device output of the input device and the 
signal, and displaying user feedback data on the display device, the logical sequence of the user 
feedback data including at least one sequence of options sufficient to define an operable control 
program, and a presentation of additional programming options if the control program is not 
operable. 

The programmable control may further comprise a user input processing system for 
adaptively determining a viewer preference based on the user input received by the controller; a 
program material processing system for characterizing the program material based on its content; 
a correlator for correlating the characterized content of the program material with the determined 
viewer preference to produce a correlation index; and a processor, selectively processing the 
program material based on the correlation index, the data processing system receiving an input 
from the processor. 
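
As one hedged illustration of a correlator producing a correlation index, the characterized content and the viewer preference may each be treated as a set of weighted features and compared by cosine similarity. The feature dictionaries, the choice of cosine similarity, and the 0.7 threshold are assumptions of this sketch, not values from the specification.

```python
import math


def correlation_index(content, preference):
    """Cosine similarity between two feature-weight dictionaries."""
    keys = set(content) | set(preference)
    dot = sum(content.get(k, 0.0) * preference.get(k, 0.0) for k in keys)
    norm = (math.sqrt(sum(v * v for v in content.values()))
            * math.sqrt(sum(v * v for v in preference.values())))
    return dot / norm if norm else 0.0


def should_process(content, preference, threshold=0.7):
    """Selectively process program material when the index is high enough."""
    return correlation_index(content, preference) >= threshold


# Hypothetical characterizations.
viewer = {"sports": 2.0}
game = {"sports": 1.0}
bulletin = {"news": 1.0}
```

With these hypothetical features, the game correlates fully with the viewer preference and would be processed, while the bulletin shares no features and would not.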

The programmable control may also comprise a plurality of stored profiles, a processor 
for characterizing the user input to produce a characterized user input; and means for comparing 
the characterized user input with at least one of the plurality of stored profiles to produce a 
comparison index, wherein the variable sequence of programming options is determined on the 
basis of the comparison index. The processor for characterizing may perform an algorithm on 
the signal comprising a transform selected from the group consisting of an Affine transformation, 
a Fourier transformation, a discrete cosine transformation and a wavelet transformation. 

It is a further object according to the present invention to provide a programmable 
controller for controlling a recording device for recording an analog signal sequentially on a 
recording medium having a plurality of uniquely identifiable storage locations, further 
comprising a sequential recording device for recording the analog signal, and a memory for 
storing, in a directory location on the recording medium which is separate from the storage 
location of the analog signal, information relating to the signal, processed to selectively retain 
characterizing information, and an identifier of a storage location on the recording medium in 
which the analog signal is recorded. 

It is another object according to the present invention to provide a control, wherein 
program material is encrypted, further comprising a decryption system for decrypting the 
program material if it is selected to produce unencrypted program material and optionally an 
associated decryption event; a memory for storing data relating to the occurrence of the 
decryption event; and a central database for storing data relating to the occurrence of the 
decryption event in association with data relating to the viewer. 

It is still another object according to the present invention to provide a control wherein 
the user input processing system monitors a pattern of user activity and predicts a viewer 
preference; the program material processing system comprising a processor for preprocessing the 
program material to produce a reduced data flow information signal substantially retaining 
information relating to the abstract information content of the program material and selectively 
eliminating data not relating to the abstract information content of the program material and for 
characterizing the information signal based on the abstract information content; and a comparing 
system for determining if the correlation index is indicative of a probable high correlation 
between the characterization of the information signal and the viewer preference and causing the 
stored program material to be processed by the processing means based on the determination. 
The system according to this aspect of the present invention preferably comprises an image 
program material storage and retrieval system. 

The present invention further provides a control further comprising a memory for storing 
a characterization of the program material; an input for receiving a feedback signal from the 
viewer indicating a degree of agreement with the correlation index determination, wherein the 
feedback signal and the stored characterization are used by the viewer preference predicting 
means to predict a new viewer preference. 

According to another aspect of the invention, it is an object to provide an image 
information retrieval apparatus, comprising a memory for storing compressed data representing a 
plurality of images; a data storage system for retrieving compressed data representing at least one 
of the plurality of images and having an output; a memory for storing characterization data 
representing a plurality of image types, having an output; and an image processor, receiving as 
inputs the outputs from the data storage system and the characterization data memory, and 
producing a signal corresponding to a relation between at least one of the plurality of images of 
the compressed data and at least one of the image types of the characterization data. 

It is a still further aspect of the present invention to provide a video interface device for a 
user comprising a data transmission system for simultaneously transmitting data representing a 
plurality of programs; a selector for selecting at least one of the plurality of programs, being 
responsive to an input; a program database containing information relating to the plurality of 
programs, having an output; a graphical user interface for defining commands, comprising (a) an 
image display device having at least two dimensions of display, being for providing visual image 
feedback; and (b) a multidimensional input device having at least two dimensions of operability, 
adapted to correspond to the two dimensions of the display device, and having an output, so that 
the user may cause the input device to produce a corresponding change in an image of the display 
device by translating an indicator segment of the display in the at least two dimensions of 
display, based on the visual feedback received from the display device, the indicator segment 
being moved to a translated location of the display device corresponding to a user command; and 
a controller for controlling the graphical user interface and for producing the input of the 
selector, receiving as a control the output of the multidimensional input device, the controller 
receiving the output of the program database and presenting information relating to at least one 
of the plurality of programs on the display device associated with a command, the command 
being interpreted by the control means as the user command to produce the input of the selector 
to select the at least one of the plurality of programs associated with the command. 

Another object of the present invention is to provide an apparatus, receiving as an input 
from a human user having a user characteristic, comprising an input device, producing an input 
signal from the human user input; a display for displaying information relating to the input from 
the user and feedback on a current state of the apparatus, having an alterable image type; an input 
processor for extracting an input instruction relating to a desired change in a state of the 
apparatus from the input signal; a detector for detecting one or more temporal-spatial user 
characteristics of the input signal, independent of the input instruction, selected from the group 
consisting of a velocity component, an efficiency of input, an accuracy of input, an interruption 
of input and a high frequency component of input; a memory for storing data related to the user 
characteristics; and a controller for altering the image type based on the user characteristics. The 
controller may alter the image type based on an output of the detector and the stored data so that 
the display displays an image type which corresponds to the detected user characteristics. The 
controller may further be for controlling the causation of an action on the occurrence of an event, 
further comprising a control for receiving the input instruction and storing a program instruction 
associated with the input instruction, the control having a memory sufficient for storing program 
instructions to perform an action on the occurrence of an event; and a monitor for monitoring an 
environment of the apparatus to determine the occurrence of the event, and causing the 
performance of the action on the occurrence of the event. The controller may also alter the 
image type based on an output of the detector and the stored data so that the display means 
displays an image type which corresponds to the detected user characteristics. 
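
Two of the temporal-spatial user characteristics named above, a velocity component and an efficiency of input, might be derived from pointer samples as sketched below. The sample format of (time, x, y) tuples, the definition of efficiency as direct distance over path length, and the "simplified" and "standard" image types are assumptions made for the example.

```python
import math


def path_length(samples):
    """Total distance travelled; samples are (t_seconds, x, y) in time order."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (_, x1, y1), (_, x2, y2) in zip(samples, samples[1:]))


def velocity(samples):
    """Mean pointer speed over the whole movement."""
    elapsed = samples[-1][0] - samples[0][0]
    return path_length(samples) / elapsed if elapsed > 0 else 0.0


def efficiency(samples):
    """Straight-line distance divided by path length (1.0 is perfectly direct)."""
    path = path_length(samples)
    (_, x0, y0), (_, x1, y1) = samples[0], samples[-1]
    return math.hypot(x1 - x0, y1 - y0) / path if path > 0 else 1.0


def choose_image_type(samples, min_efficiency=0.8):
    """Pick a hypothetical image type from the detected input efficiency."""
    return "simplified" if efficiency(samples) < min_efficiency else "standard"


# Hypothetical movements: one direct, one wandering back and forth.
direct = [(0.0, 0, 0), (1.0, 3, 4)]
wandering = [(0.0, 0, 0), (1.0, 3, 4), (2.0, 0, 0), (3.0, 3, 4)]
```

These measures are independent of the instruction the user finally enters, which is the property the detector relies on; the stored history of such values could then bias the choice of image type.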

It is another object of the present invention to provide an adaptive programmable 
5 apparatus having a plurality of states, being programmable by a programmer and operating in an 
environment in which a plurality of possible events occur, each of the events being associated 
with different data, comprising an data input for receiving data; an programmer input, producing 
an input signal from the programmer; a memory for storing data relating to the data input or the 
input signal; a feedback device for adaptively providing information relating to the input signal 

10 and a current status of the apparatus to the programmer, based on the data input or the 

programmer input, the stored data, and derived weighing of at least a subset of possible choices, 
the derived weighing being based on a history of use, a context of a respective choice and the 
current status of the apparatus; a memory for storing programming data associated with the input 
signal; and a processor, having a control output, for controlling the response of the apparatus
relating to the detection of the input signal or the data in accordance with the stored
programming data, the processor: (a) processing at least one of the input signal or the data to
reduce an amount of information while substantially retaining an abstract portion of the
information; (b) storing a quantity of the abstracted information; (c) processing the abstract
portion of the information in conjunction with the stored quantity of abstracted information; and
(d) providing the control output based on the processed abstract portion of the information and
the stored programming data. The apparatus may further comprise an input for receiving a 
programming preference from the programmer indicating a plurality of possible desired events; 
the processor further including a correlator for correlating the programming preference with the 
data based on an adaptive algorithm and for determining a likelihood of occurrence of at least
one of the desired events, producing the control output. The apparatus may further comprise an
input for receiving feedback from the programmer indicating a concurrence with the control 
output of the processor, and modifying the response control based on the received feedback to 
increase a likelihood of concurrence. The apparatus may still further verify the programming 
data to ensure that the programming data comprise a complete and consistent set of instructions;
and include a feedback system for interactively modifying the programming data. The apparatus
may also comprise a chronological database and an accessing system for accessing the
chronological database on the basis of the programming data stored in the memory. 
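
By way of illustration only, the feedback-driven modification described above might be sketched as an online logistic update, in which each indication of concurrence or non-concurrence nudges a set of stored weights. The feature names, the learning rate, and the functions `predict` and `update_from_feedback` are assumptions made for this sketch; the specification does not prescribe a particular adaptive algorithm.

```python
import math


def predict(weights, features):
    """Estimate the likelihood (0..1) that the programmer wants this event.

    `weights` and `features` are dicts keyed by hypothetical feature names.
    """
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def update_from_feedback(weights, features, concurred, learning_rate=0.5):
    """Return new weights moved toward the programmer's feedback.

    `concurred` is True when the programmer agreed with the control output.
    """
    error = (1.0 if concurred else 0.0) - predict(weights, features)
    updated = dict(weights)
    for name, value in features.items():
        updated[name] = updated.get(name, 0.0) + learning_rate * error * value
    return updated
```

Repeated calls to `update_from_feedback` raise the predicted likelihood for features the programmer has accepted and lower it for features the programmer has rejected, which is one simple way to increase the likelihood of concurrence over time.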

It is also an object according to the present invention to provide an apparatus comprising 
an input for receiving a programming preference from the programmer indicating a plurality of 
possible desired events; and a correlator for correlating the programming preference with the data
based on an adaptive algorithm and for determining a likelihood of occurrence of at least one of
the desired events, producing the output, the output being associated with the initiation of the
response. 

The present invention also provides as an object an apparatus comprising an input for
receiving feedback from the programmer indicating a concurrence with the output of the
correlator, and modifying the algorithm based on the received feedback, the feedback device
comprising a display, and the input device being remote from the display and providing a direct
manipulation of display information of the display.

According to an aspect of the present invention, a processor of the programmable
apparatus verifies the program instructions to ensure that the program instructions are valid and
executable by the processor; an output for providing an option, selectable by the programmer
input for changing an instruction stored by the processor, such that the apparatus enters a state
wherein a new instruction may be input to substitute for the instruction, wherein the processor
verifies the instructions such that the instructions are valid; and wherein the feedback device
further presents information requesting confirmation from the programmer of the instructions
associated with the input signal. The apparatus may further comprise a chronological database 
and an accessing system for accessing the chronological database on the basis of the program 
instructions stored in the memory. 

The processor of the programmable apparatus may receive information from the input
signal and/or from the data input; and may further comprise an input signal memory for storing
at least a portion of the input signal or the data, a profile generator for selectively generating a
profile of the input signal or the data, and an input signal profile memory for storing the profile
of the input signal or the data separately from the input signal or the data in the input signal
memory. The programmable apparatus may further comprise a processor for comparing the
input signal or the data with the stored profile of the input signal or the data to determine the
occurrence of an event, and the data optionally comprises image data and the processor for
comparing performs image analysis. The image data may comprise data having three associated 
dimensions obtained by a method selected from the group consisting of synthesizing a three 
dimensional representation based on a machine based model derived from two dimensional 
image data, synthesizing a three dimensional representation derived from a time series of pixel 
images, and synthesizing a three dimensional representation based on image data representing a
plurality of parallax views each having at least two dimensions. 

A user feedback data presenting device according to the present invention may comprise 
a display having a plurality of display images, the display images differing in available 
programming options. 

According to another aspect of the present invention, a program material processing 
system is provided comprising means for storing template data; means for storing the image data; 
means for generating a plurality of domains from the stored image data, each of the domains 
representing different portions of the image information; means for creating, from the stored 
image data, a plurality of addressable mapped ranges corresponding to different subsets of the 
stored image data, the creating means including means for executing, for each of the mapped 
ranges, a procedure upon the one of the subsets of the stored image data which corresponds to the 
mapped range; means for assigning identifiers to corresponding ones of the mapped ranges, each 
of the identifiers specifying for the corresponding mapped range an address of the corresponding 
subset of stored image data; means for selecting, for each of the domains, the one of the mapped 
ranges which most closely corresponds according to predetermined criteria; means for 
representing at least a portion of the image information as a set of the identifiers of the selected 
mapped ranges; and means for selecting, from the stored templates, a template which most 
closely corresponds to the set of identifiers representing the image information. The means for 
selecting may comprise means for selecting, for each domain, the mapped range which is the 
most similar, by a method selected from at least one of the group consisting of selecting a 
minimum Hausdorff distance from the domain, selecting the highest cross-correlation with the 
domain and selecting the lowest mean square error of the difference between the mapped range 
and the domain. The means for selecting may also comprise, for each domain, the mapped range 
with the minimum modified Hausdorff distance calculated as D[db,mrb] + D[1 - db,1 - mrb],
where D is a distance calculated between a pair of sets of data each representative of an image,
db is a domain, mrb is a mapped range, 1 - db is the inverse of a domain, and 1 - mrb is an inverse
of a mapped range. The means for representing may further comprise means for determining a 
feature of interest of the image data, selecting a mapped range corresponding to the feature of 
interest, storing the identifiers of the selected mapped range, selecting a further mapped range 
corresponding to a portion of image data having a predetermined relationship to the feature of 
interest and storing the identifiers of the further mapped range. 

According to an embodiment of the present invention, the image data comprises data 
having three associated dimensions obtained by a method selected from the group consisting of 
synthesizing a three dimensional representation based on a machine based prediction derived 
from two dimensional image data, synthesizing a three dimensional representation derived from 
a time series of pixel images, and synthesizing a three dimensional representation based on
image data representing a plurality of parallax views having at least two dimensions.

It is therefore an object of the present invention to provide a programmable apparatus for 
receiving instructions from a programmer and causing an action to occur on the happening of an 
event, comprising an input device, producing an input instruction signal; a control means for 
receiving the input instruction signal, and storing a program instruction associated with the input 
instruction signal, the control means storing sufficient program instructions to perform an action 
on the occurrence of an event, the control means monitoring a status of the apparatus to 
determine the occurrence of various events, comparing the determined events with the program 
instructions, and performing the action on the occurrence of the event; a display means for 
interactively displaying information related to the instructions to be received, and responsive 
thereto, controlled by the control means, so that the programmer is presented with feedback on a 
current state of the apparatus and the program instruction; wherein the control means further 
comprises means for detecting one or more characteristics of the input instruction signal 
independent of the program instruction selected from the group consisting of a velocity 
component, an efficiency of input, an accuracy of input, an interruption of input, a high 
frequency component of input and a past history of input by the programmer, whereby when the 
control means detects a characteristic indicating that the display means is displaying information
in a suboptimal fashion, the control means controls the display means to display information in a
more optimal fashion. 

It is also an object of the present invention to provide a programmable apparatus for 
receiving instructions from a programmer and causing an action to occur on the happening of an 
event, comprising an input device, producing an input instruction signal; a control means for
receiving the input instruction signal, and storing a program instruction associated with the input 
instruction signal, the control means storing sufficient program instructions to perform an action 
on the occurrence of an event, the control means monitoring a status of the apparatus to 
determine the occurrence of various events, comparing the determined events with the program
instructions, and performing the action on the occurrence of the event; a display means for
interactively displaying information related to the instructions to be received, and responsive 
thereto, controlled by the control means, so that the programmer is presented with feedback on a 
current state of the apparatus and the program instruction; wherein the control means further 
comprises means for detecting a need by the programmer for more detailed information
displayed on the display means, by detecting one or more characteristics of the input instruction
signal independent of the program instruction selected from the group consisting of a velocity 
component, an efficiency of input, an accuracy of input, an interruption of input, a high 
frequency component of input and a past history of input by the programmer, whereby when the 
control means detects a characteristic indicating that the display means is displaying insufficiently detailed
information, the control means controls the display means to display more detailed information.
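
As a non-limiting sketch, the detection of a need for more detailed information might be approximated by examining the accuracy of input and interruptions of input in a log of input events. The event format, the two threshold values and the function `needs_more_detail` are hypothetical and chosen only for illustration; the specification also lists other characteristics (velocity, efficiency, high frequency components, past history) that are not modeled here.

```python
def needs_more_detail(events, pause_threshold=5.0, error_threshold=0.3):
    """Decide whether the display should switch to a more detailed mode.

    `events` is a time-ordered list of (timestamp_seconds, kind) tuples, where
    kind is "key" for an ordinary input or "correction" for an undo or retry.
    Both thresholds are illustrative assumptions.
    """
    if len(events) < 2:
        return False  # too little evidence to infer anything

    # Accuracy of input: fraction of inputs that were corrections.
    corrections = sum(1 for _, kind in events if kind == "correction")
    error_rate = corrections / len(events)

    # Interruption of input: longest gap between consecutive inputs.
    longest_pause = max(b[0] - a[0] for a, b in zip(events, events[1:]))

    return error_rate > error_threshold or longest_pause > pause_threshold
```

A control means built along these lines would call `needs_more_detail` after each input and, when it returns true, select a display screen carrying more explanatory information.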

It is a further object of the present invention to provide a programmable apparatus having 
a data input, the apparatus receiving instructions from a programmer and causing an action to 
occur on the receipt of data indicating an event, comprising an input device, producing an input 
instruction signal; a control means for receiving the input instruction signal, and storing a
program instruction associated with the input instruction signal, the control means storing
sufficient program instructions to perform an action on the receipt of data indicating an event, the
control means monitoring the data input; a display means for interactively displaying information
related to the instructions to be received, and responsive thereto, controlled by the control means,
so that the programmer is presented with feedback on a current state of the apparatus and the
program instruction; wherein the control means receives a programming preference indicating a
desired event from the input device which does not unambiguously define the event, and the 
control means monitors the data and causes the occurrence of the action when a correlation 
between the programming preference and the monitored data is above a predetermined threshold, 
indicating a likely occurrence of the desired event. It is also an object of the present invention to
provide the aforementioned programmable apparatus, wherein the input device is remote from
the display means, and provides a direct manipulation of display information of the display 
means, further comprising means for verifying the program instructions so that the program 
instructions are executable by the control means. The control means may further comprise a 
calendar or other chronological database. 
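
For illustration, the threshold test described above might be sketched as follows, treating both the ambiguous programming preference and the monitored data as sparse sets of weighted features and using cosine similarity as the correlation. The feature representation, the default threshold of 0.7 and the function names are assumptions of this sketch rather than features of the disclosed apparatus.

```python
import math


def cosine_correlation(preference, observed):
    """Cosine similarity between two dicts mapping feature name to weight."""
    shared = set(preference) & set(observed)
    dot = sum(preference[k] * observed[k] for k in shared)
    norm_p = math.sqrt(sum(v * v for v in preference.values()))
    norm_o = math.sqrt(sum(v * v for v in observed.values()))
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0  # an empty description correlates with nothing
    return dot / (norm_p * norm_o)


def should_trigger(preference, observed, threshold=0.7):
    """Cause the action only when the correlation reaches the threshold."""
    return cosine_correlation(preference, observed) >= threshold
```

Because the preference need not unambiguously define the event, a partial match can still trigger the action, provided the correlation meets the predetermined threshold.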

Another object of the present invention provides a programmable information storage 
apparatus having a data input, for receiving data to be stored, the apparatus receiving instructions 
from a programmer and causing an action to occur on the receipt of data indicating an event, 
comprising means for storing data from the data input; an input device, producing an input 
instruction signal; a control means for receiving the input instruction signal, and storing a
program instruction associated with the input instruction signal, the control means storing 
sufficient program instructions to perform an action on the receipt of data from the data input 
indicating an event, the control means monitoring the data input to determine the occurrence of 
various events, comparing the determined events with the program instructions, and performing
the action for storing the data on the occurrence of the event; wherein the control means receives
identifying data from at least one of the input device and the data input, the identifying data 
being stored separately from the input data on a storage medium. The programmable 
information storage apparatus may also include means for reading the identifying data stored 
separately on the storage medium, and may also receive as an input the identifying data. 

It is also an object of the present invention to provide a programmable apparatus, wherein 
the control means provides an option, selectable by the input means in conjunction with the 
display means, for changing an input program instruction prior to execution by the control 
means, so that the apparatus enters a state wherein a new program instruction may be input to 
substitute for the changed input step, wherein the control means verifies the program instructions 
so that the program instructions are executable by the control means.

It is still another object of the present invention to provide a programmable apparatus,
wherein the control means further causes the display means to display a confirmation screen after 
the program instructions are input, so that the programmer may confirm the program instructions. 

Another object of the present invention is to provide a programmable information storage 
apparatus, wherein the control means further comprises means for recognizing character data
present in a data stream of the input data, the identifying data comprising the recognized 
character data. 

It is a still further object of the present invention to provide a video tape recording 
apparatus, comprising a video signal receiving device, a recording device for recording the video
signal, wherein the control analyzes the video signal for the presence of a symbol, and recognizes
the symbol as one of a group of recognized symbols, and the control stores the recognized 
symbol separately from the video signal. 

Another object of the present invention is to provide a recording device for recording an 
analog signal sequentially on a recording medium, comprising means for characterizing the
analog signal, wherein data representing the characterization and a location of the analog signal
on the recording medium are stored in a directory location on the recording medium separately 
from the analog signal. 
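
One possible arrangement of such a directory, offered only as a sketch, keeps the sequentially recorded samples in one structure and the characterization and location data in another. The class names, the field names and the use of in-memory lists in place of a physical recording medium are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    start: int              # location of the segment on the medium
    length: int             # number of samples in the segment
    characterization: dict  # e.g. a recognized title or a content class


@dataclass
class RecordingMedium:
    signal: list = field(default_factory=list)     # sequential signal area
    directory: list = field(default_factory=list)  # separate directory area

    def record(self, samples, characterization):
        """Append samples sequentially and log where they were written."""
        entry = DirectoryEntry(len(self.signal), len(samples), characterization)
        self.signal.extend(samples)
        self.directory.append(entry)
        return entry

    def find(self, key, value):
        """Search the directory without scanning the recorded signal."""
        return [e for e in self.directory if e.characterization.get(key) == value]
```

Because the directory is held apart from the signal, a search such as `find("title", "news")` returns segment locations directly, and the medium need only be advanced to the stored `start` position.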

It is a further object of the present invention to provide an interface for a programmable 
control for input of a program for a controller to execute, which performs an action based on an
external signal, comprising an input device, a controller for receiving data from the input device
and from an external stimulus, a plant being controlled by the controller based on an input from
the input device and the external stimulus, and a display device being controlled by the 
controller, for providing visual feedback to a user operating the input device, wherein a 
predetermined logical sequence of programming options is presented to the user on the display
device, in a plurality of display screens, each of the display screens differing in available
programming choices; the logical sequence including a correct sequence of choices to set an
operable control program, so that no necessary steps are omitted; the external stimulus comprises 
a timing device, and the display comprises a display option for programming the plant to perform 
an action at a time which is input through the input device as a relative position on the display
device, the relative position including a means for displaying an absolute time entry and means
for displaying a relative time entry, the display also comprising a display option means for
performing an action at a time; the control comprises means for presenting the user, on the 
display device, with a most probable action, which may be selected by the user through 
activation of the input device without entering data into the controller through the input device 
relating to both the action and the event; the display also comprising means for indicating 
completion of entry of a programming step, which means indicates to the user that
the programming step is not completed if information necessary for execution of the step is not 
available to the controller; and the controller being capable of controlling the display device to 
present information to the user relating to the use of the apparatus if necessary for use of the 
device by the user. 

Another object of the present invention provides a system for presenting a program to a 
viewer, comprising a source of program material; means for determining a viewer preference, the 
viewer preference optionally being context sensitive; means for receiving the program material 
from the source; means for characterizing the program material based on its content; means for 
correlating the characterized content of the program material with the determined viewer 
preference to produce a correlation index; and means for presenting the program material to the 
viewer, if the correlation index indicates a probable high correlation between the characterization 
of the program material and the viewer preference. 

Another object of the present invention is to provide a system for presenting a program to 
a viewer, comprising a source of program material; means for determining a viewer preference; 
means for receiving the program material from the source; means for storing the program 
material; means for preprocessing the program material to produce a reduced data flow 
information signal retaining information relating to a character of the program material and 
eliminating data not necessary to characterize the program material; means for characterizing the 
information signal based on its content; means for correlating the characterized content of the 
information signal with the determined viewer preference to produce a correlation index; and 
means for presenting the stored program material to the viewer, if the correlation index indicates 
a probable high correlation between the characterization of the information signal and the viewer 
preference. The system may also include a means for storing the information signal, wherein the 
characterizing means characterizes the stored information signal, and also a memory for storing
the program material while the characterizing means produces characterized content and the
correlating means produces the correlation index. 

Still another object of the present invention is to provide a system, wherein the program 
material is encrypted, further comprising means for decrypting the program material to produce a 
decryption event; and means for charging an account of the viewer based on the occurrence of a
decryption event. Thus, a decryption processor and an accounting database are provided for 
these purposes. 

Another object of the present invention is to allow the means for characterizing the 
program material to operate without causing a decryption event. Thus, the data stream may
include characterization data specifically suitable for processing by a characterizing system, or
the decryption processor may be provided with multiple levels of functionality, or both. Further, 
the system may comprise a memory for storing the program material while the characterizing 
means produces characterized content and the correlating means produces the correlation index. 
The characterizing means may also characterize the program material stored in memory, and the
program material stored in memory may be compressed.

Another object of the present invention is to provide a controller for controlling a plant, 
having a sensor for sensing an external event and producing a sensor signal, an actuator, 
responsive to an actuator signal, for influencing the external event, and a control means for 
receiving the sensor signal and producing an actuator signal, comprising means for inputting a
program; means for storing the program; means for characterizing the sensor signal to produce a
characterized signal; and means for comparing the characterized signal with a pattern stored in a 
memory to produce a comparison index, wherein the actuator signal is produced on the basis of 
the comparison index and the program, wherein the characterization comprises an Affine 
transformation of the sensor signal. The characterization may comprise one or more
transformations selected from the group consisting of an Affine transformation, a Fourier
transformation, a Gabor transformation, and a wavelet transformation. 
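
As one hedged example of the Fourier option, a sensor signal might be characterized by its discrete Fourier magnitude spectrum and compared with a stored pattern to give a comparison index between 0 and 1. The direct-summation transform, the cosine-similarity index and the function names below are assumptions of this sketch; a practical controller would likely use a fast transform and a comparison suited to the plant.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude spectrum (bins 0..n/2) of a real signal by direct summation."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2 + 1)
    ]


def comparison_index(characterized, pattern):
    """Cosine similarity between a characterized signal and a stored pattern."""
    dot = sum(a * b for a, b in zip(characterized, pattern))
    norm_a = math.sqrt(sum(a * a for a in characterized))
    norm_b = math.sqrt(sum(b * b for b in pattern))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because only magnitudes are compared, two signals of the same frequency but different phase yield an index near 1, while signals of different frequencies yield an index near 0. The actuator signal could then be produced when the index exceeds a value set by the stored program.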

It is another object of the present invention to provide a method for automatically 
recognizing digital image data consisting of image information, the method comprising the steps 
performed by a data processor of storing a plurality of templates; storing the image data in the
data processor; generating a plurality of addressable domains from the stored image data, each of
the domains representing a portion of the image information; creating, from the stored image
data, a plurality of addressable mapped ranges corresponding to different subsets of the stored 
image data, the creating step including the substep of (a) executing, for each of the mapped 
ranges, a corresponding procedure upon the one of the subsets of the stored image data which 
corresponds to the mapped ranges; (b) assigning identifiers to corresponding ones of the mapped 
ranges, each of the identifiers specifying for the corresponding mapped range a procedure and an
address of the corresponding subset of the stored image data; (c) optionally subjecting a domain
to a transform selected from the group consisting of a predetermined rotation, an inversion, a 
predetermined scaling, and a predetermined preprocessing in the time, frequency, and/or wavelet 
domain; (d) selecting, for each of the domains or transformed domains, the one of the mapped 
ranges which most closely corresponds according to predetermined criteria; (e) representing the 
image information as a set of the identifiers of the selected mapped ranges; and (f) selecting, 
from the stored templates, a template which most closely corresponds to the set of identifiers 
representing the image information. The step of selecting the mapped ranges may also include 
the substep of selecting, for each domain, a most closely corresponding one of the mapped 
ranges. 

It is another object of the present invention to provide a method wherein the step of 
selecting the most closely corresponding one of the mapped ranges includes the step of selecting, 
for each domain, the mapped range which is the most similar, by a method selected from one or 
more of the group consisting of selecting minimum Hausdorff distance from the domain, 
selecting the highest cross-correlation with the domain, selecting the highest fuzzy correlation 
with the domain and selecting the minimum mean square error with the domain. 
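
Two of the listed criteria, minimum mean square error and highest cross-correlation, can be sketched as follows for domains and mapped ranges flattened into equal-length lists of pixel values. The normalized (Pearson) form of cross-correlation and the function names are assumptions of this sketch; the Hausdorff and fuzzy correlation criteria are not shown here.

```python
import math


def mean_square_error(a, b):
    """Mean of squared pixel differences between two equal-length blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_correlation(a, b):
    """Normalized cross-correlation in [-1, 1]; 0.0 if either block is flat."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_range(domain, mapped_ranges, criterion="mse"):
    """Return the index of the mapped range that best matches the domain."""
    indices = range(len(mapped_ranges))
    if criterion == "mse":
        return min(indices, key=lambda i: mean_square_error(domain, mapped_ranges[i]))
    if criterion == "xcorr":
        return max(indices, key=lambda i: cross_correlation(domain, mapped_ranges[i]))
    raise ValueError("unknown criterion: " + criterion)
```

The criteria need not agree: cross-correlation is insensitive to a uniform scaling of pixel intensity, so it can prefer a brighter copy of the domain that mean square error would reject.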

Another object of the present invention provides a method wherein the step of selecting 
the most closely corresponding one of mapped ranges includes the step of selecting, for each 
domain, the mapped range with the minimum modified Hausdorff distance calculated as 
D[db,mrb] + D[1 - db,1 - mrb], where D is a distance calculated between a pair of sets of data
each representative of an image, db is a domain, mrb is a mapped range, 1 - db is the inverse of a
domain, and 1 - mrb is an inverse of a mapped range.
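
The modified distance D[db,mrb] + D[1 - db,1 - mrb] can be illustrated for binary image blocks, assuming that D is the symmetric Hausdorff distance between the sets of foreground pixel coordinates and that the inverse of a block flips each pixel. These choices, the handling of empty sets and the function names are assumptions of this sketch; the text above leaves the exact form of D open.

```python
import math


def foreground(block):
    """Coordinates of the non-zero pixels of a 2-D binary block."""
    return [(r, c) for r, row in enumerate(block) for c, v in enumerate(row) if v]


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two sets of points."""
    if not a and not b:
        return 0.0
    if not a or not b:
        return math.inf  # assumption: an empty set is infinitely far away

    def directed(p_set, q_set):
        return max(min(math.dist(p, q) for q in q_set) for p in p_set)

    return max(directed(a, b), directed(b, a))


def invert(block):
    """The inverse 1 - block of a binary block."""
    return [[1 - v for v in row] for row in block]


def modified_hausdorff(db, mrb):
    """D[db, mrb] + D[1 - db, 1 - mrb]."""
    return (hausdorff(foreground(db), foreground(mrb))
            + hausdorff(foreground(invert(db)), foreground(invert(mrb))))


def closest_range(db, mapped_ranges):
    """Index of the mapped range with the minimum modified distance."""
    return min(range(len(mapped_ranges)),
               key=lambda i: modified_hausdorff(db, mapped_ranges[i]))
```

Adding the distance between the inverted blocks penalizes mismatches in the background as well as the foreground, so a range must resemble the domain in both respects to be selected.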

Another object of the present invention provides a method wherein the digital image data 
consists of a plurality of pixels each having one of a plurality of associated color map values,
further comprising the steps of optionally transforming the color map values of the pixels of each
domain by a function including at least one scaling function for each axis of the color map, each 
of which may be the same or different, and selected to maximize the correspondence between the 
domains and ranges to which they are to be matched; selecting, for each of the domains, the one 
of the mapped ranges having color map pixel values which most closely correspond to the color
map pixel values of the domain according to predetermined criteria, wherein the step of
representing the image color map information includes the substep of representing the image 
color map information as a set of values each including an identifier of the selected mapped 
range and the scaling functions; and selecting a most closely corresponding stored template,
based on the identifier of the color map mapped range, the scaling functions and the set of
identifiers representing the image information. The first criteria may comprise minimizing the
Hausdorff distance between each domain and the selected range. 

Another object of the present invention is to provide a method further comprising the 
steps of storing delayed image data, which represents an image of a moving object differing in
time from the image data in the data processor; generating a plurality of addressable further
domains from the stored delayed image data, each of the further domains representing a portion
of the delayed image information, and corresponding to a domain; creating, from the stored 
delayed image data, a plurality of addressable mapped ranges corresponding to different subsets 
of the stored delayed image data; matching the further domain and the domain by subjecting a
further domain to one or both of a corresponding transform selected from the group consisting of
a null transform, a rotation, an inversion, a scaling, a translation and a frequency domain 
preprocessing, which corresponds to a transform applied to a corresponding domain, and a 
noncorresponding transform selected from the group consisting of a rotation, an inversion, a 
scaling, a translation and a frequency domain preprocessing, which does not correspond to a
transform applied to a corresponding domain; computing a motion vector between one of the
domain and the further domain, or the set of identifiers representing the image information and 
the set of identifiers representing the delayed image information, and storing the motion vector; 
compensating the further domain with the motion vector and computing a difference between the 
compensated further domain and the domain; selecting, for each of the delayed domains, the one
of the mapped ranges which most closely corresponds according to predetermined criteria;
representing the difference between the compensated further domain and the domain as a set of
difference identifiers of a set of selected mapping ranges and an associated motion vector and 
representing the further domain as a set of identifiers of the selected mapping ranges; 
determining a complexity of the difference based on a density of representation; and when the 
difference has a complexity below a predetermined threshold, selecting, from the stored 
templates, a template which most closely corresponds to the set of identifiers of the image data 
and the set of identifiers of the delayed image data. 
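
The motion vector and difference computation can be sketched, under simplifying assumptions, as an exhaustive block-matching search that minimizes the sum of absolute differences between a domain and candidate further domains in the delayed image. The search window, the cost function and the function names are illustrative assumptions, and the transforms (rotation, inversion, scaling) described above are omitted.

```python
def block_at(frame, top, left, size):
    """Extract a size x size block from a 2-D list of pixel values."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sum_abs_diff(a, b):
    """Sum of absolute pixel differences between two equal-sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def motion_vector(frame, delayed, top, left, size, search=2):
    """Find the displacement (dy, dx) of a domain within the delayed frame.

    Returns ((dy, dx), residual), where residual is the remaining difference
    after motion compensation; a low residual indicates a simple difference.
    """
    domain = block_at(frame, top, left, size)
    height, width = len(delayed), len(delayed[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the frame
            cost = sum_abs_diff(domain, block_at(delayed, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1], best[0]
```

The returned residual plays the role of the difference between the compensated further domain and the domain; comparing it with a threshold is one simple stand-in for the complexity test described above.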

Another object of the present invention provides an apparatus for automatically 
recognizing digital image data consisting of image information, comprising means for storing 
template data; means for storing the image data; means for generating a plurality of addressable 
domains from the stored image data, each of the domains representing a different portion of the 
image information; means for creating, from the stored image data, a plurality of addressable 
mapped ranges corresponding to different subsets of the stored image data, the creating means 
including means for executing, for each of the mapped ranges, a procedure upon the one of the 
subsets of the stored image data which corresponds to the mapped range; means for assigning 
identifiers to corresponding ones of the mapped ranges, each of the identifiers specifying for the 
corresponding mapped range an address of the corresponding subset of stored image data; means
for selecting, for each of the domains, the one of the mapped ranges which most closely 
corresponds according to predetermined criteria; means for representing the image information as 
a set of the identifiers of the selected mapped ranges; and means for selecting, from the stored 
templates, a template which most closely corresponds to the set of identifiers representing the 
image information. 

It is also an object of the present invention to provide a method and system for processing 
broadcast material having a first portion and a second portion, wherein the first portion 
comprises a content segment and the second portion comprises a commercial segment, in order
to allow alteration in the presentation of commercial segments, based on the recipient, 
commercial sponsor, and content provider, while providing means for accounting for the entire 
broadcast. 

Another object of an embodiment of the present invention provides an apparatus 
comprising a user interface, receiving a control input and a user attribute from the user; a
memory system, storing the control input and user attribute; an input for receiving content data;
means for storing data describing elements of the content data; means for presenting information 
to the user relating to the content data, the information being for assisting the user in defining a 
control input, the information being based on the stored user attribute and the data describing 
elements of the content data; and means for processing elements of the content data in
dependence on the control input, having an output. This apparatus according to this embodiment
may be further defined as a terminal used by users of a television program delivery system for 
suggesting programs to users, wherein the user interface comprises means for gathering the user 
specific data to be used in selecting programs; the memory system comprises means, connected
to the gathering means, for storing the user specific data; the input for receiving data describing
elements of the content data comprises means for receiving the program control information 
containing the program description data; and the processing means comprises program selection 
means, operably connected to the storing means and the receiving means, for selecting one or 
more programs using a user's programming preferences and the program control information. In
this case, the program selection means may comprise a processor, wherein the user programming
preferences are generated from the user specific data; and means, operably connected to the 
program selection means, for suggesting the selected programs to the user. The apparatus 
processing means may selectively record the content data based on the output of the processing
means. Further, the presenting means presents information to the user in a menu format. The
presenting means may comprise means for matching the user attribute to content data.

The data describing elements of an associated data stream may, for example, comprise a 
program guide generated remotely from the apparatus and transmitted in electronically accessible 
form; data defined by a human input, and/or data defined by an automated analysis of the content 
data. 

According to another embodiment, the present invention comprises a method, comprising
the steps of receiving data describing a user attribute; receiving a content data stream, and
extracting from the content data stream information describing a plurality of program options; 
and processing the data describing a user attribute and the information describing a plurality of 
program options to determine a likely user preference; and selectively processing a program option
based on the likely user preference. The method may be embodied in a terminal for a television
program delivery system for suggesting programs to users for display on a television using
program control information and user specific data. In that case, the step of receiving data 
describing a user attribute may comprise gathering user specific data to be used in selecting
programs, and storing the gathered user specific data; the step of receiving a content data stream
may comprise receiving both programs and program control information for selecting programs
as the information describing a plurality of program options; the selectively processing step may 
comprise selecting one or more programs using a user's programming preferences and the 
received program control information, wherein the user programming preferences are generated 
from the user specific data; and the method may further include the step of presenting the program
or information describing a program option for the selected programs to the user.

The user attribute may comprise a semantic description of a preference, or some other 
type of description, for example a personal profile, a mood, a genre, an image representing or
relating to a scene, a demographic profile, a past history of use by the user, a preference against 
certain types of media, or the like. In the case of a semantic preference, the data processing step
may comprise determining a semantic relationship of the user preference to the information
describing a plurality of program options. The program options may, for example, be transmitted
as an electronic program guide, the information being in-band with the content (being transmitted 
on the same channel), on a separate channel or otherwise out of band, through a separate 
communications network, e.g., the Internet, dial-up network, or other streaming or packet based
communications system, or by physical transfer of a computer-readable storage medium, such as
a CD-ROM or floppy disk. The electronic program guide may include not only semantic or 
human-readable information, but also other types of metadata relating to or describing the 
program content. 

In a further embodiment of the present invention, it is an object to provide a device for
identifying a program in response to user preference data and program control information
concerning available programs, comprising means for gathering the user preference data; means,
connected to the gathering means, for storing the gathered user preference data; means for 
accessing the program control information; and means, connected to the storing means and 
accessing means, for identifying one or more programs based on a correspondence between a 
user's programming preferences and the program control information. For example, the
identifying means identifies a plurality of programs, a sequence of identifications transmitted to
the user being based on a degree of correspondence between a user's programming preferences 
and the respective program control information of the identified program. The device may
selectively record or display the program, or identify the program for the user, who may then
define the appropriate action by the device. Therefore, a user may, instead of defining "like"
preferences, define "dislike" preferences, which are then used to avoid or filter certain
content. Thus, this feature may be used for censoring or parental screening, or merely to avoid 
unwanted content. Thus, the device comprises a user interface adapted to allow interaction 
between the user and the device for response to one or more of the identified programs. The
device also preferably comprises means for gathering the user specific data, which comprises means for
monitoring a response of the user to identified programs.

It is a further object of the invention to provide a device which serves as a set top terminal 
used by users of a television program delivery system for suggesting programs to users using 
program control information containing scheduled program description data, wherein the means
for gathering the user preference data comprising means for gathering program watched data; the
means, connected to the gathering means, for storing the gathered user preference data 
comprising means, connected to the gathering means, for storing the program watched data; the 
means for accessing the program control information comprising means for receiving the 
program control information comprising the scheduled program description data; the means,
connected to the storing means and accessing means, for identifying one or more programs based
on a correspondence between a user's programming preferences and the program control 
information, being for selecting at least one program for suggestion to the viewer, comprising: 
means for transforming the program watched data into preferred program indicators, wherein a 
program indicator comprises a program category with each program category having a weighted
value; means for comparing the preferred program indicators with the scheduled program
description data, wherein each scheduled program is assigned a weighted value based on at least
one associated program category; means for prioritizing the scheduled programs from highest 
weighted value programs to lowest weighted value programs; means for indicating one or more 
programs meeting a predetermined weight threshold, wherein all other programs are excluded 
from program suggestion; and means, operably connected to the program selection means, for
displaying for suggestion the selected programs to the user. 
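
The weighting, prioritizing and threshold steps recited above may be illustrated by the following minimal Python sketch. It is offered only as one possible reading: the function names, the use of viewing frequency as the category weight, and the summing of category weights to score a scheduled program are assumptions made for illustration and are not specified in the text.

```python
from collections import Counter

def preferred_indicators(watched):
    """Turn a list of watched-program categories into weights summing to 1.

    Assumption: a category's weight is simply its share of past viewing.
    """
    counts = Counter(watched)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def suggest(indicators, schedule, threshold):
    """Score scheduled programs, rank high to low, keep those meeting the threshold."""
    scored = []
    for title, categories in schedule.items():
        # Assumption: a program's weight is the sum of its category weights.
        weight = sum(indicators.get(c, 0.0) for c in categories)
        scored.append((title, weight))
    # Prioritize from highest weighted value to lowest.
    scored.sort(key=lambda item: item[1], reverse=True)
    # Programs below the threshold are excluded from suggestion.
    return [(title, w) for title, w in scored if w >= threshold]

# Hypothetical program watched data and schedule.
watched = ["news", "sports", "sports", "drama"]
schedule = {
    "Evening Report": ["news"],
    "Cup Final": ["sports"],
    "Cooking Hour": ["cooking"],
}
indicators = preferred_indicators(watched)
suggestions = suggest(indicators, schedule, threshold=0.25)
```

In this sketch a "dislike" preference could be modeled as a negative category weight, so that disliked content falls below the threshold and is filtered out.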

It is a further aspect of the invention to provide a device comprising: a data
selector, for selecting a program from a data stream; an encoder, for encoding programs in a 
digitally compressed format; a mass storage system, for storing and retrieving encoded programs; 
a decoder, for decompressing the retrieved encoded programs; and an output, for outputting the 
decompressed programs. 

Therefore, the present invention provides a system and method for making use of the 
available broadcast media forms for improving an efficiency of matching commercial 
information to the desires and interests of a recipient, improving a cost effectiveness for 
advertisers, improving a perceived quality of commercial information received by recipients and 
increasing profits and reducing required information transmittal by publishers and media 
distribution entities. 

This improved advertising efficiency is accomplished by providing a system for collating 
a constant or underlying published content work with a varying, demographically or otherwise 
optimized commercial information content. This commercial information content therefore need 
not be predetermined or even known to the publisher of the underlying works, and in fact may be 
determined on an individual receiver basis. It is also possible to integrate the demographically 
optimized information within the content. For example, overlays in traditional media, and 
electronic substitutions or edits in new media, may allow seamless integration. The content
alteration need not be only based on commercial information, and therefore the content may vary 
based on the user or recipient. 

The technologies emphasize adaptive pattern recognition of both the user input and data, 
with possible use of advanced signal processing and neural networks. These systems may be 
shared between the interface and operational systems, and therefore a controller for a complex 
system may make use of the intrinsic processing power available, rather than requiring additional 
computing resources, although this unification is not required. In fact, while hardware efficiency 
dictates that near term commercial embodiments employ common hardware for the interface 
system and the operational system, future designs may successfully separate the interface system 
from the operational system, allowing portability and efficient application of a single interface
system for a number of operational systems. 

The adaptive nature of the technologies derives from an understanding that people learn
most efficiently through the interactive experiences of doing, thinking, and knowing. Users
change in both efficiency and strategy over time. To promote ease-of-use, efficiency, and lack of
frustration of the user, the interface of the device is intuitive and self explanatory, providing 
perceptual feedback to assist the operator in communicating with the interface, which in turn 
allows the operational system to identify a desired operation. Another important aspect of
man-machine interaction is that there is a learning curve, which dictates that devices which are
especially easy to master become frustratingly elemental after continued use, while devices
which have complex functionality with many options are difficult to master and may be initially
rejected, or used only at the simplest levels. The present technologies address these issues by 
determining the most likely instructions of the operator, and presenting these as easily available 
choices, by analyzing the past history data and by detecting the "sophistication" of the user in
performing a function, based on all information available to it. The context of use is also a factor
in many systems. The interface seeks to optimize the interface adaptively and immediately in
order to balance and optimize both quantitative and qualitative factors. This functionality may 
greatly enhance the quality of interaction between man and machine, allowing a higher degree of 
overall system sophistication to be tolerated. 

The interface system analyzes data from the user, which may be both the selections made
by the user in context, as well as the efficiency by which the user achieves the selection. Thus,
information concerning both the endpoints and path are considered and analyzed by the human 
user interface system. 

The interface may be advantageously applied to an operational system which has a
plurality of functions, certain of which are unnecessary or are rarely used in various contexts,
while others are used with greater frequency. In such systems, the application of functionality 
may be predictable. Therefore, the present technologies provide an optimized interface system 
which, upon recognizing a context, dynamically reconfigures the availability or ease of 
availability of functions and allows various functional subsets to be used through "shortcuts".
The interface presentation will therefore vary over time, use and the particular user.

The advantages to be gained by using an intelligent data analysis interface for facilitating
user control and operation of the system are more than merely reducing the average number of 
selections or time to access a given function. Rather, advantages also accrue from providing a 
means for access and availability of functions not necessarily previously existing or known to the 
user, improving the capabilities and perceived quality of the product.

Further improvements over prior interfaces are also possible due to the availability of 
pattern recognition functionality as a part of the interface system. In those cases where the
pattern recognition functions are applied to large amounts of data or complex data sets, in order 
to provide a sufficient advantage and acceptable response time, powerful computational
resources, such as powerful RISC processors, advanced DSPs or neural network processors are
made available to the interface system. On the other hand, where the data is simple or of limited 
scope, aspects of the technology may be easily implemented as added software-based 
functionality in existing products having limited computational resources. 

The application of these technologies to multimedia data processing systems provides a
new model for performing image pattern recognition and for the programming of applications
including such data. The ability of the interface to perform abstractions and make decisions
regarding a closeness of presented data to selection criteria makes the interface suitable for use in 
a programmable control, i.e., determining the existence of certain conditions and taking certain 
actions on the occurrence of detected events. Such advanced technologies might be especially
valuable for disabled users.

In a multimedia environment, it may be desirable for a user to perform an operation on a 
multimedia data event. Past systems have required explicit indexing or identification of images 
and events. The present technologies, however, allow an image, diagrammatic, abstract or 
linguistic description of the desired event to be acquired by the interface system from the user
and applied to identify or predict the multimedia event(s) desired, without requiring a separate
manual indexing or classification effort. These technologies may also be applied to single media 
data. 

The interface system analyzes data from many different sources for its operation. Data
may be stored or present in a dynamic data stream. Thus, in a multimedia system, there may be a
real-time video feed, a stored event database, as well as an exemplar or model database. Further,
since the device is adaptive, information relating to past experience of the interface, both with
respect to exposure to data streams and user interaction, is also stored. 

This data analysis aspect of the interface system may be substantially processor intensive, 
especially where the data includes abstract or linguistic concepts or images to be analyzed. 
Interfaces which do not relate to the processing of such data may be implemented with simpler 
hardware. On the other hand, systems which handle complex data types may necessarily include 
sophisticated processors, adaptable for use by the interface system. A portion of the data 
analysis may also overlap the functional analysis of the data for the operational system. 

Other objects and features of the present invention will become apparent from the 
following detailed description considered in conjunction with the accompanying drawings. It is 
to be understood, however, that the drawings are designed solely for the purposes of illustration 
and not as a definition of the limits of the invention, for which reference should be made to the 
appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention are shown in the figures in the drawings, in which:
Fig. 1 is a flow chart of the steps required to set a VCR; 

Fig. 2 shows a graphical comparison of required and extra keypresses for the prior art and 
the interface of the present invention; 

Fig. 3 graphically shows the differences in seconds between total time for the prior art for 
each user; 

Fig. 4 graphically shows the differences in seconds between total time for the interface of 
the present invention for each user; 

Fig. 5 graphically shows the programming steps for the comparison of the prior art and 
the interface of the present invention; 

Fig. 6 graphically shows comparative statistics by user comparing the prior art and the 
interface of the present invention; 

Figs. 7 and 8 graphically show the critical steps in programming the prior art and the 
interface of the present invention; 

Fig. 9 graphically shows the number of keypresses made by test participants comparing 
the prior art and the interface of the present invention; 

Fig. 10 graphically shows the comparison of the actual and theoretical number of 
keypresses necessary for programming the prior art and the interface of the present invention; 

Fig. 11 graphically compares the actual and theoretical time necessary for programming 
the prior art and the interface of the present invention; 

Figs. 12a and 12b graphically compare the actual and theoretical time necessary for
setting the programs in the prior art and the interface of the present invention; 

Figs. 13 and 14 graphically show the percentage time for the critical steps in 
programming the prior art and the interface of the present invention; 

Fig. 15 is a flow diagram of a predictive user interface of the present invention; 

Fig. 16 is a flow diagram of the program input verification system of the present 
invention; 

Fig. 17 is a flow diagram of a predictive user preference aware interface of the present 
invention;

Fig. 18 is a block diagram of a non-program information feature extraction circuit of the
present invention; 

Fig. 19 is a diagram of a block of information for a catalog entry of the present invention; 
Fig. 20 is a block diagram of a digital information and analog signal reading/recording 
apparatus; 

Fig. 21 is a block diagram of a user level determining system of the present invention; 
Fig. 22 is a block diagram of a template-based pattern recognition system of the present 
invention; 

Fig. 23 is a block diagram of a control system of the present invention incorporating a 
pattern recognition element and an interface; 

Fig. 24 is a block diagram of a control system for characterizing and correlating a signal 
pattern with a stored user preference of the present invention; 

Fig. 25 is a block diagram of a multiple video signal input apparatus, with pattern 
recognition, data compression, data encryption, and a user interface of the present invention; 

Fig. 26 is a block diagram of a control system for matching a template with a sensor 
input, of the present invention; 

Figs. 27, 28 and 29 are flow diagrams of an iterated function system method for 
recognizing a pattern according to the present invention; 

Fig. 30 is a semi-cartoon flow diagram of the object decomposition and recognition 
method of the present invention; and 

Fig. 31 is a block diagram of an adaptive interface system according to the present 
invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The preferred embodiments of the present invention will now be described with reference
to the Figures. Identical elements in the various figures are designated with the same reference
numerals.

EXAMPLE 1 
VCR INTERFACE 

A preferred embodiment of the interface of the present invention, described in the present 
example, provides automatic sequencing of steps, leading the user through the correct sequence 
of actions to set a program on the screen, so that no necessary steps are omitted, and no optional 
steps are accidentally or unintentionally omitted. These steps are shown diagrammatically in 
Fig. 15 of the present invention. In addition, such a system does not burden the user with the 
necessity of inputting superfluous information, nor overwhelm the user with the display of 
unnecessary data. See, Hoffberg, Linda I., "AN IMPROVED HUMAN FACTORED
INTERFACE FOR PROGRAMMABLE DEVICES: A CASE STUDY OF THE VCR",
Master's Thesis, Tufts University; Hoffberg, Linda I., "Designing User Interface Guidelines For
Time-Shift Programming of a Video Cassette Recorder (VCR)", Proc. of the Human Factors Soc. 
35th Ann. Mtg. pp. 501-504 (1991); and Hoffberg, Linda I., "Designing a Programmable 
Interface for a Video Cassette Recorder (VCR) to Meet a User's Needs", Interface 91 pp. 346-351 
(1991). See also, U.S. Patent Application No. 07/812,805, incorporated herein by reference in its 
entirety, including appendices and incorporated references. 

Many design considerations were found to be important in the improved interface of the 
present invention: 

The interface should preferably employ only minimal amounts of abbreviations and the
use of complete words is especially preferred, except where a standard abbreviation is available 
or where an "iconic" or symbolic figure or textual cue is appropriate. Thus, standard 
abbreviations and symbols are acceptable, and displayed character strings may be shortened or 
truncated in order to reduce the amount of information that is to be displayed, where necessary or 
desirable. An option may be provided to the user to allow full words, which may decrease the 
information which may be conveyed on each screen and increase the number of screens that must 
be displayed, or abbreviations and symbols, which may minimize the number of displayed
screens of information, thus allowing the user to make the compromise. This aspect of the 
system may also be linked to the adaptive user level function of the present invention, wherein 
abstract symbols and abbreviations are presented to advanced users, while novices are presented 
with full words, based on an implicit indication of user level. These abstract symbols and
abbreviations may be standard elements of the system, or user designated icons. Of course, the
user could explicitly indicate his preference for the display type, thus deactivating the automatic 
adaptive user level function. 
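
As a rough illustration of the display choice just described, the following Python sketch selects between full words and abbreviations. The label table, the level names "novice" and "advanced", and the function name are hypothetical; the text does not prescribe how the user level is represented.

```python
# Hypothetical label table: each function has a full word and an abbreviation.
LABELS = {
    "record": {"full": "Record", "short": "REC"},
    "channel": {"full": "Channel", "short": "CH"},
}

def label_for(key, inferred_level, explicit_style=None):
    """Pick a label style; an explicit user choice overrides the inferred level."""
    if explicit_style is not None:
        # The user stated a preference, so the adaptive function is bypassed.
        style = explicit_style
    else:
        # Assumption: advanced users see abbreviations, all others full words.
        style = "short" if inferred_level == "advanced" else "full"
    return LABELS[key][style]
```

For example, `label_for("channel", "advanced")` yields the abbreviation, while `label_for("channel", "advanced", "full")` honors the explicit request for full words.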

If multiple users use the device, then the device identifies the relevant users. This may be
by explicit identification by keyboard, bar code, magnetic code, smart card (which may
advantageously include a user profile for use with a number of devices), an RF-ID or IR-ID
transponder, voice recognition, image recognition, or fingerprint identification. It is noted that 
smart cards or other intelligent or data-containing identification systems may be used with
different types of devices, for example video, audio, home appliances, HVAC and automobile
systems.

Where a new user is identified to the system, an initial query may be made to determine 
an optimum initial user level. This allows further identification of the user and preference 
determination to occur more efficiently. 

In applications in which a user must program an event on a certain date, at a certain time,
a built-in calendar menu screen is preferably employed so that the user cannot set the device with
a program step that relies on a non-existent date. Technology that will help eliminate the human 
problem of setting the wrong (yet existing) date may also be employed. Such technology might 
include accessing an on-line or other type of database containing media programming 
information, and prompting the user regarding the selected choice. In situations where it is
applicable, the interface should indicate to the user the number of characters the interface is
expecting, such as when entering the year. 
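
A calendar-based guard against non-existent dates might be sketched in Python as follows. The helper names are illustrative only; the sketch relies on the standard `calendar` and `datetime` modules and does not address the separate problem of a wrong but existing date.

```python
import calendar
from datetime import date

def valid_days(year, month):
    """Return the day numbers that actually exist in the given month."""
    # monthrange returns (weekday of the first day, number of days in month).
    _, last_day = calendar.monthrange(year, month)
    return list(range(1, last_day + 1))

def make_date(year, month, day):
    """Build a date, or return None when the calendar has no such day."""
    if not 1 <= month <= 12 or day not in valid_days(year, month):
        return None
    return date(year, month, day)
```

A calendar menu screen built on `valid_days` would offer only existing days, so an entry such as February 29 in a non-leap year could never be selected.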

The interface system provides an easily accessible CHANGE, CANCEL or UNDO 
(single or multiple level) feature, which facilitates backtracking or reprogramming the 
immediately previously entered information rather than forcing the user to repeat all or a
substantial portion of the programming steps. A method of the type described is shown in
Fig. 16 of the present invention. User input is also facilitated by the provision of frequently used
settings as explicit choices, such as, referring to the VCR example, "Record today," "Record 
tomorrow," "Noon," and "Midnight," so that the user does not have to specify a date in these 
cases. This will eliminate extra keypresses, and reduce the programming time. In addition, this 
could eliminate user errors. Frequently used choices for program selections are also provided to
the user to reduce the number of programming steps necessary and provide the user with all the 
frequently used selections. The especially preferred choices are "Once On .", "Once a Week on 
.", "Monday - Friday at .", "Everyday at .". These redundant, complex instructions reduce the 
number of keystrokes required for data entry, and reduce the amount of programming time 
required.
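
The multiple level UNDO feature mentioned above may be pictured as a stack of prior field values, as in this minimal Python sketch. The class and field names are hypothetical, and the sketch assumes that no field legitimately holds the value `None`.

```python
class ProgramEntry:
    """Holds the fields of one recording program plus an undo history."""

    def __init__(self):
        self.fields = {}
        self._history = []  # stack of (field, previous value or None)

    def set(self, field, value):
        # Remember what was there before so the change can be reversed.
        self._history.append((field, self.fields.get(field)))
        self.fields[field] = value

    def undo(self):
        """Reverse the most recent entry; return False if nothing to undo."""
        if not self._history:
            return False
        field, previous = self._history.pop()
        if previous is None:
            # The field did not exist before this entry, so remove it.
            self.fields.pop(field, None)
        else:
            self.fields[field] = previous
        return True

entry = ProgramEntry()
entry.set("channel", 4)
entry.set("start", "20:00")
entry.set("start", "21:00")
entry.undo()  # restores the earlier start time without re-entering the channel
```

Each call to `undo` steps back one entry, so the user corrects only the most recent information rather than repeating the whole programming sequence.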

The presently described interface system also provides, in the event that a color screen is 
available, conservatively used color coding, which allows the user to effectively and quickly 
acknowledge the function of each aspect of the screen. When programming, the preferred colors 
are royal blue for "help," red for mistakes, light blue for information previously entered, and
yellow for current information being entered. Of course, other colors could be used, according to
the user's or designer's preference, cultural differences, and display parameters. 

When viewing, it is preferable that screen colors change to indicate status changes, such 
as viewed/unviewed, or to categorize the shows. 

The interface includes a confirmation screen which displays to the user all of the
categories and selections previously explicitly entered or otherwise inferred, and should be easily
understandable. This is shown in Fig. 15 of the present invention. All of the necessary 
information is displayed on this screen, in addition to the change and cancel options, if possible. 

The entering of information on each screen is preferably consistent throughout the 
various interface options and levels. All of the screens preferably have similar layouts.
"Buttons" or screen locations which are keyed to a particular function, which appear on multiple
screens, should appear in approximately the same location on all screens. However, in certain 
cases, relatively more important information on a given screen may be displayed more 
prominently, and possibly in a different screen location, in order to reduce the search time. 
Further, when other factors dictate, each screen may be independently optimized for the
prescribed function. For example, a representation of an analog clock dial may be used to set
time information. However, even if the format does change, a standard scheme should be
maintained, such as the use of a particular color to indicate that a particular program aspect has 
been changed. 

The interface should display data consistent with standards and conventions familiar to 
users. For example, when entering dates, users are most familiar with calendars. However, this type
of presentation of choices does not eliminate the human problem of entering incorrect 
information, e.g., setting a wrong, but existing, date. The problem of ensuring the accuracy of 
user input may be addressed by an intelligent interface which stores data concerning
programming and user preferences, and which uses some logical method, such as Boolean logic,
fuzzy logic, neural network theory, or any other system which may be used to generate a
prediction, to determine if an entry is likely in error, by comparing the prediction with the entry.
Of course, these predictive systems would also provide an initial default entry, so that an a priori
most probable action or actions may be initially presented to the user.
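
The comparison of a prediction with an entry may be illustrated by the following Python sketch. It substitutes a simple frequency count for the Boolean, fuzzy or neural methods named above, and the history format of (weekday, hour, channel) tuples and the function names are assumptions made for illustration.

```python
from collections import Counter

def predict(history, weekday):
    """Most frequent (hour, channel) previously recorded on this weekday, or None."""
    slots = Counter((h, ch) for d, h, ch in history if d == weekday)
    if not slots:
        return None
    return slots.most_common(1)[0][0]

def check_entry(history, weekday, hour, channel):
    """Return 'ok' or 'confirm' by comparing the entry with the prediction."""
    predicted = predict(history, weekday)
    # With no prediction available, there is no basis to question the entry.
    if predicted is None or predicted == (hour, channel):
        return "ok"
    return "confirm"

# Hypothetical stored programming data.
history = [("Mon", 20, 4), ("Mon", 20, 4), ("Tue", 21, 7)]
default = predict(history, "Mon")  # may be offered as the initial default entry
```

Here an entry of 8 o'clock on a Monday, where 20 o'clock is usual, would return "confirm", prompting the user to check for a possible AM/PM slip rather than rejecting the entry outright.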

In addition to following conventions of information presentation to the user, the interface
of the present invention may also provide emulations of other user interfaces with which a
particular user may be familiar, even if these are not optimized according to the presently
preferred embodiments of the present invention, or not otherwise well known. These emulations 
need not even be of the same type of device, so that a broad based standard for entry of 
information into programmable controls, regardless of their type, may be implemented. By
allowing emulation, the interface could provide compatibility with a standard or proprietary
interface, with enhanced functionality provided by the features of the present interface. 

These enhanced functional intelligent aspects of the controller may be implemented by 
means of software programming of a simple microcomputer, or by use of more specialized 
processors, such as a Fuzzy Set Processor (FSP) or Neural Network Processor, to provide
real-time responsiveness, eliminating delays associated with the implementation of complex
calculations on general purpose computing devices. 

In the various embodiments according to the present invention, various control strategies 
are employed. Depending on the application, fuzzy set processors (FSP's) may be preferred 
because they have the advantage of being easier to program through the use of presumptions or
rules for making the fuzzy inferences, which may be derived by trial and error or the knowledge
of experts, while Neural Networks are less easily explicitly programmed and their network
weighting values are not easily understood in the abstract, but these systems may be applied to
learn appropriate responses from test data. Thus, neural networks tend to require extensive 
"training", while Fuzzy Set Processors may be explicitly programmed without the need of 
duplicating or simulating actual operating conditions, but may require "fine tuning". 

The most frequently used choices preferably should be displayed as the default setting. 
The screen cursor preferably appears at the "accept" screen button, when the screen is displayed. 
This default can either be set in advance, or acquired by the system. In the case of acquired 
defaults, these may be explicitly set by the user or adaptively acquired by the system through use. 
The interface of the present invention may be taught, in a "teach" mode, the preferences of the 
user, or may also acquire this information by analyzing the actual choices made by the user 
during operation of the interface and associated controller. This type of operation is shown 
schematically in Fig. 15 of the present invention. The options of "Midnight" (12:00 AM) and 
"Noon" (12:00 PM) should preferably be present, as some people often become confused when 
distinguishing between them. Icons, such as those indicative of the "sun" and the "moon", may 
also be used to facilitate data entry for AM and PM. The interface should preferably utilize an 
internal clock and calendar so that the user cannot set the time or program to record on a 
nonexistent date. Such a system could also compensate for daylight-savings time seasonal 
adjustments. 
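
A minimal sketch of an adaptively acquired default and of the calendar check follows, offered under stated assumptions. The class name `AdaptiveDefault`, the function `is_real_date`, and the rule that the most frequently chosen option becomes the default are illustrative choices, not details of the disclosed system.

```python
from collections import Counter
from datetime import date


class AdaptiveDefault:
    """Default that starts at a preset value and follows the user's choices."""

    def __init__(self, preset):
        self.preset = preset
        self.counts = Counter()

    def record_choice(self, choice):
        """Note one actual selection made by the user during operation."""
        self.counts[choice] += 1

    def current(self):
        """Return the most frequently chosen option, or the preset if unused."""
        if not self.counts:
            return self.preset
        return self.counts.most_common(1)[0][0]


def is_real_date(year, month, day):
    """Use the calendar to reject nonexistent dates such as February 30."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False
```

An explicit "teach" mode could be modeled in the same structure by calling `record_choice` on the user's stated preferences before any operation takes place.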

The cursor is preferably distinctive and readily distinguished from other parts of the 
screen. This may be by color, attribute (e.g., blinking), size, font change of underlying text, or by
other means. 

The user can preferably exit the programming sequence at any time by selecting a "Main 
Menu" button which may exist on the lower left-hand corner of every screen. The user is 
preferably provided with an adequate amount of feedback, and error messages should be 
directive in nature. Some form of an acknowledgement is preferably displayed after each entry. 
The user should preferably not be able to go to the next programming step until the current step 
has been completed. A message to convey why the user cannot continue should appear when an
attempt to prematurely continue is recognized.

The "help" function is available for when the user does not know what to do. The "help"
screen(s) preferably explains the functions of each of the available buttons or functions, but may 
also be limited to those that are ambiguous. The "help" screen may also be used to indicate a 
current status of the interface and the controller. Further, the "help" function may also provide 
access to various other functions, such as advanced options and configurations, and thus need not 
be limited to merely providing information on the display. The help system may incorporate a 
hypertext-type system, wherein text or information relating to concepts that are conceptually 
linked may be easily accessed by indicating to the interface system that the related information is 
desired. To eliminate the possibility of the user trying to make selections on merely informative 
help screens, the cursor, in these cases, should be locked to a choice which returns the user to 
where they left off in the programming sequence, and this choice should be highlighted. 

The "help" function may also comprise "balloon help" similar to the system adopted by 
Apple Computer, Inc. in Macintosh Operating System, e.g., 7.0, 7.1, 7.5, etc. 

The interface preferably initiates the programming sequence where the user wants to be, 
so that the interface has so-called "smart screens". For example, when a VCR is first powered up 
or after an extended power failure, and the time and date are not stored in the machine, the "set 
date" and "set time" screens should appear. The sequence of screens may also vary depending on 
the system predicted requirements of the user and various aspects of the improved interface of 
the present invention. This is shown schematically in Fig. 17 of the present invention. 

The preferable input device for the interface of the present invention provides as few 
buttons as possible to achieve the required functionality, thus reducing potential user 
intimidation, focusing the user's attention on the interactive display screen, where the available 
choices are minimized to that number necessary to efficiently allow the user to program the 
discrete task presented. Such a minimization of discrete inputs facilitates a voice recognition 
input, which may be used as an alternative to mechanical input devices. The preferred 
embodiment includes a direct-manipulation type interface, in which a physical act of the user 
causes a proportionate change in the associated interface characteristic, such as cursor position. 
A computer mouse, e.g. a two-dimensional input device, with 1 to 3 buttons is the preferred input
device for use with a general purpose computer as a controller, while a trackball on a remote
control device is especially preferred for limited purpose controllers because it does not require
a flat surface for operation. Other stationary or movement sensitive input devices may, of course,
be used, such as joysticks, gyroscopes, sonic echo-location, magnetic or electrostatic location 
devices, RF phase location devices, Hallpots (joystick-like device with magnets that move with 
respect to Hall effect transducers), etc. The present interface minimizes the number of necessary 
keys present on an input device, while maintaining the functionality of the interface. It is noted 
that a strict minimization without consideration of functionality, might lead to inefficiency. For 
example, in a VCR device, if the user wants to record a program which airs Monday through 
Friday, he would have to set five separate programs, rather than one program if a "weeknights" 
choice is made available. 

The interface preferably should be easy to learn and should not require that a user have 
prior knowledge of the interface in order to use it. An attempt has been made to minimize the 
learning curve, i.e., to minimize the time it takes to learn how to use the device. 

Menu options are preferably displayed in logical order or in their expected frequencies. 
Research has shown that a menu-driven interface is best for applications involving new users and 
does not substantially hinder experienced users. Menu selection is preferably used for tasks 
which involve limited choices. They are most helpful for users with little or no training. Each 
menu should preferably allow only one selection at a time. Most of the information is preferably 
entered using a numeric keypad (entry method), rather than using up and down arrow keys 
(selection method). In addition, no leading zeros are required for entry. If there is more than one 
keystroke required, the user must then select an "OK" button to continue in the programming 
sequence. However, if the selection method is used, all of the choices are displayed on the 
screen at once. The number of steps required to complete the task through a sequence of menus 
should be minimized. The choice of words used to convey information should not be device 
specific, i.e., computer terms, but rather normal, everyday terms which are easy to understand. 
In addition, very few abbreviations should be used. All necessary information which the user 
needs should preferably be displayed at once. A user preferably should not have to rely on his 
memory or his previous experience, in order to find the correct choice, at least at the lower user 
levels. If all selections cannot be displayed at once, a hierarchical sequence is preferably used. 
A main menu should preferably provide a top level to which the user can always return and start 
over.

Searching and learning times should be kept to a minimum in order to obtain a
subjectively better interface. The system's logic should reflect the users' expectations, offer
visual clues and feedback, and stay within human memory limits. For example, the VCR should
turn on not only with the "Power" button, but also when inserting a tape into the device. In
addition, the sequence of steps for setting the machine to record, if the user does not indicate
implicitly or explicitly that he knows how to use the device, should assume that the user is a
novice, and fully prompt the user for elemental items of information. Nothing should be taken
for granted. By developing an improved interface, an attempt is made to: reduce the searching
time; reduce the learning time; simplify the entering of data; and, reduce the intimidation
experienced by certain persons when using electronic devices.

Tests by an inventor hereof show that people do not program their VCRs often, and they 
often forget the sequence of steps between recording sessions. Thus, the present invention 
preferably incorporates an adaptive user level interface, wherein a novice user is presented with a 
simpler interface with fewer advanced features initially available, so that there is reduced
searching for the basic functions. A more advanced user is presented with more advanced
choices and functions available initially, as compared to a novice user. 

Thus, as shown in Fig. 17, the user identifies himself to the controller in block 1701. The
controller 1806 of Fig. 18 thereafter uses a stored profile of the identified user in controlling the
interaction with the user, as shown in block 1702 of Fig. 17, from information stored in the
database 1807 of Fig. 18 of the present invention. It has been found that in the case of novice
users, a greater number of simple instructions may be more quickly and easily input than a
potentially smaller number of more complex instructions drawn from a larger set. It has further
been found that, even if presented with a set of instructions which will allow a program to be
entered with fewer inputs, a novice user may choose to input the program using the simple
instructions exclusively, thus employing an increased number of instructions and being delayed
by an increased search time for those instructions that are used, from the larger set.

Other characteristics of this interface include color coding to help prompt the user as to
which data must be entered. Red text signifies instructions or errors, yellow text represents data
which must be entered or has not been changed, and blue text shows newly entered program data
or status information. Blue buttons represent buttons which should normally be pressed during
the programming sequence. Red buttons signify a departure from the normal pattern of data
entry, such as the "cancel" and "return to main menu" buttons. Of course, these colors can be
replaced by other display attributes, such as intensity, underline, reverse video, blinking and
pixel dithering pattern, in addition to the use of various fonts. Such a substitution would be
appropriate, for example, with a monochrome monitor or display.

The date may be entered in the form of a calendar rather than as numbers (e.g., "9/6/91").
This calendar method is advantageous because users may wish to input date data in one of three
ways: day of the week, day relative to the present, and day of the month. The present method
allows the current date to be highlighted, so that the calendar may be used to easily enter the
absolute day, absolute date, and relative day. Further, the choices "today" and "tomorrow", the
most frequently used relative recording times, are included in addition to a month-by-month
calendar. This information is provided to avoid an unnecessary waste of time and user
frustration. Thus, another aspect of the present invention is to provide a partially redundant
interactive display input system which allows, according to the highest probability, the choices to
be prominently displayed and easily available, in addition to allowing random access to all
choices. 
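
The three ways of entering a date discussed above (relative day, day of the week, and day of the month) might be resolved to a single calendar date as in the following sketch. The function name `resolve_day` and the rule that a weekday name means its next occurrence on or after the current date are assumptions made for illustration.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_day(choice, today):
    """Map a menu choice to a calendar date.

    choice may be "today", "tomorrow", a weekday name, or an integer day
    of the current month; a nonexistent day raises ValueError.
    """
    if choice == "today":
        return today
    if choice == "tomorrow":
        return today + timedelta(days=1)
    if choice in WEEKDAYS:
        # Days until the next occurrence of that weekday (0 if it is today).
        offset = (WEEKDAYS.index(choice) - today.weekday()) % 7
        return today + timedelta(days=offset)
    if isinstance(choice, int):
        return today.replace(day=choice)
    raise ValueError("unrecognized day choice: %r" % (choice,))
```

Because every path yields the same kind of result, the redundant entry methods can share one downstream programming sequence.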

The present device allows common user mistakes to be recognized and possibly
addressed, such as confusing 12:00 PM with midnight and 12:00 AM with noon.
Therefore, the options of "noon" and "midnight" are provided in addition to a direct
numeric clock input. When entering time information, leading zeros need not be entered, and
such information may be entered in either fashion. 
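
One possible handling of time entry, accepting the "noon" and "midnight" choices as well as keypad digits with or without leading zeros, is sketched below. The function name `to_24_hour`, the string form of the keypad entry, and the conversion to a 24-hour internal value are assumptions of this sketch.

```python
def to_24_hour(entry, meridiem=None):
    """Convert a user time entry to (hour, minute) on a 24-hour scale.

    entry is "noon", "midnight", or keypad digits such as "7", "730",
    "0730" or "7:30"; meridiem is "AM" or "PM" for numeric entries.
    """
    entry = entry.strip().lower()
    if entry == "noon":
        return (12, 0)
    if entry == "midnight":
        return (0, 0)
    digits = entry.replace(":", "")
    if not digits.isdigit() or len(digits) > 4 or meridiem not in ("AM", "PM"):
        raise ValueError("unrecognized time entry")
    if len(digits) <= 2:
        hour, minute = int(digits), 0           # hour only, e.g. "7"
    else:
        hour, minute = divmod(int(digits), 100)  # e.g. "730" -> 7, 30
    if not (1 <= hour <= 12 and minute <= 59):
        raise ValueError("time out of range for a 12 hour clock")
    # 12 AM maps to hour 0 and 12 PM maps to hour 12.
    return (hour % 12 + (12 if meridiem == "PM" else 0), minute)
```

The explicit "noon" and "midnight" branches never consult the AM/PM selection, which is the point at which users are said to become confused.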

The criterion for system acceptance of input depends on how many keystrokes are required
on the screen. If only one keystroke is required to complete input of the information, upon
depressing the key, the programming sequence will continue. If more than one keypress is
required, the user must depress the "OK" button to continue programming. This context
sensitive information entry serves to avoid unnecessary input. 
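
The acceptance rule just described can be expressed compactly. In the following sketch, the function name `accept_input`, the use of the string "OK" to stand for the "OK" button, and the treatment of an empty entry as not yet accepted are hypothetical details.

```python
def accept_input(keys, single_keystroke):
    """Return (value, accepted) for a sequence of key presses on one screen.

    When the screen needs only one keystroke, the first key completes the
    entry. Otherwise keys accumulate until the "OK" button is pressed.
    """
    buffer = ""
    for key in keys:
        if single_keystroke:
            return key, True             # sequence continues immediately
        if key == "OK":
            return buffer, buffer != ""  # OK confirms a non-empty entry
        buffer += key
    return buffer, False                 # still waiting for more input
```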

An on-line "help" system and on-line feedback are preferably provided to the user
throughout various aspects of the interface. Other features include minimizing the number of
keypresses required to program the device. These features, together with other aspects of the
present invention, allow the user to achieve a greater efficiency with the input device than with
prior art devices.

The interface of the present invention applied to a VCR control preferably comprises a 
virtual keypad entry device (i.e., a representation of an array of choices), a directional input
control for a cursor on a display screen, and selection buttons. The input device has an input
corresponding to a direction of movement relative to the cursor position. Thus, since the present 
input device seeks to minimize the physical control elements of the human interface device, the 
display elements for a preferred embodiment of the present interface include:

1. Number keys 0-9.
2. Enter key.
3. Cancel key.
4. Status indicator.
5. Return to menu option button.
6. Program type indicator: program once, program once a week, program Monday-Friday, program every day.
7. Day indicators: 7 week days, today, tomorrow.
8. Noon and midnight choices.
9. Help button.
10. Main menu options: Review, Enter new recording time, Set time, Set date.
11. Timer button.
12. Power button.
13. AM/PM choices.
14. 31 day calendar.
15. 12 month choices.
16. 3 tape speed choices.

User dissatisfaction is generally proportionate to the length of "search time," the time
necessary in order to locate and execute the next desired function or instruction. Search time
may be minimized by the inclusion of up to a maximum of 4-8 choices per screen and by use of
consistent wording and placement of items on the display.

The present invention proceeds from the understanding that there are a number of aspects 
of a programmable interface that are desirable: 

First, users should be able to operate the system successfully, without wide disparities in
time. It should take, e.g., a normal person interacting with a VCR interface, less than seven
minutes to set the time and two programs. Searching time spent in setting the clock,
programming, getting into the correct mode, and checking whether or not the VCR is set
correctly should be kept to a minimum through the appropriate choices of menu layout and the
presentation of available choices.

Second, programming should be a stand-alone process, and not require an instruction 
manual. A help system should be incorporated in the interface. Word choices should be 
understandable, with a reduction in the use of confusing terminology. Error messages
should be understandable. The system should provide the ability to cancel, change, or exit from
any step. 

Third, the system should provide on-screen understandable information, with adequate 
visual feedback. The displays should be consistent. Color coding should be employed, where 
applicable, using, e.g. blue - new input; red - error condition; yellow - static, unchanged value. 
Layouts should be logical, and follow a predictable pattern. There should be a maximum of 4-8 
choices per screen to minimize searching time. Keys should be labeled with text rather than with 
ambiguous graphics. However, a combination of both may be preferable in some cases. 

Fourth, steps required to complete tasks should be simple, require a short amount of time 
and not create user frustration. The system should guide the user along a decision path, 
providing automatic sequencing of steps. The most frequently used choices should be provided 
as defaults, and smart screens may be employed. The learning curve should be minimized 
through the use of easily understandable choices. As a user becomes more sophisticated, the 
interface may present more advanced choices. 

Fifth, there should be a reminder to set the timer and to insert the tape once the 
programming information is entered. This reminder may also be automated, to eliminate the 
commonly forgotten step of setting the timer, so that the VCR automatically sets the timer as 
soon as the necessary information is entered and a tape is inserted. Once the program is set in 
memory, a message should appear if a tape is not inserted. If the VCR is part of a "jukebox" 
(automatic changer), the tape may be automatically loaded. The VCR should preferably turn on 
when a tape is inserted. In addition, users should also be able to control the VCR with a Power 
button. 

Sixth, the VCR should be programmable from both the remote device and the control
panel.

Seventh, each operation should require only one keypress, if possible, or otherwise
reduce the number of keypresses required. There should be a 12 hour clock, not a 24 hour clock. 
There should be an on-screen keypad with entry keys, not "up" and "down" selector keys, 
allowing for the choice of specific day or time entry. There should be a "start" and a "stop"
recording time, rather than "start" time and "length of program" or duration exclusively. The
number of buttons on the remote control should be minimized so that as few buttons as are 
required are provided. The input device should provide for the direct manipulation of screen 
elements. A menu driven interface should be provided. 

The interface of the present invention provides an automatic sequencing of steps which
does not normally let the user proceed until the previous step is complete. This is shown
schematically in Fig. 16. In this manner, important steps will not be inadvertently omitted. Upon
entering the programming sequence, if the current date or time is not set, the interface will
prompt the user to enter this information. Thereafter, the interface will normally default to the
main menu, the most frequently used first screen. Thus, the interface of the present invention is
adaptive, in that its actions depend on the current state of the device, including prior
programming or use of the device by the user. It can be appreciated that this adaptive behavior
can be extended to include greater "intelligence". For example, if the device is similarly
programmed on a number of occasions, then the default setup may be adapted to a new "normal"
program mode. Further, the apparatus could provide multiple levels of user interface, e.g.
beginner, intermediate, and advanced, which may differ for various functions, based on the
behavior of the user. This user interface level determining feature extraction system is shown
diagrammatically in Fig. 18. In
contrast, prior art interfaces that have different user interface levels allow the user to explicitly
choose the interface level, which will then be used throughout the system until reset. 
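
As a hedged illustration of determining the interface level from the behavior of the user, rather than by explicit selection, consider the sketch below. The observed features (sessions, errors, help requests), the thresholds, and the names `UsageStats`, `infer_level` and `levels_by_function` are all assumptions; the feature extraction of Fig. 18 is not specified at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    """Hypothetical per-function record of how the user has behaved."""
    sessions: int = 0
    errors: int = 0
    help_requests: int = 0


def infer_level(stats):
    """Classify the user for one function from observed behavior."""
    if stats.sessions < 3:
        return "beginner"  # too little history; assume a novice
    trouble = (stats.errors + stats.help_requests) / stats.sessions
    if trouble > 1.0:
        return "beginner"
    if trouble > 0.25:
        return "intermediate"
    return "advanced"


def levels_by_function(history):
    """Levels may differ per function, e.g. clock setting vs. programming."""
    return {name: infer_level(stats) for name, stats in history.items()}
```

Keeping one record per function allows a user to be treated as advanced for a frequently used function while remaining a beginner for a rarely used one.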

The present system allows discrete tasks to be conducted more quickly, more efficiently,
with reduced search time and with fewer errors than prior art systems.

EXAMPLE 2 

SERIAL RECORDING MEDIUM INDEX 

In a preferred embodiment of the present invention, in a VCR, in order to track the
content of the tape, a directory or a catalog is recorded, preferably digitally, containing the
programming information, as well as additional information about the recorded programs, in a
header, i.e., at the beginning of the tape, or at other locations on the tape. The device may also
catalog the tape contents separately, and based on an identification of the tape, use a separately
stored catalog. A preferred format for storing information is shown in Fig. 19.

Thus, if there are a number of selections on the tape, the entire contents of the tape could
be accessible quickly, without the need for searching the entire tape. In a sequential access
medium, the tape transport apparatus must still shuttle to the location of the desired material, but
it may do so at increased speeds, because there is no need to read the tape once the location is
determined; after the tape transport nears the desired spot, the tape may be slowed or precisely
controlled to reach the exact location.

The tape read and drive system is shown schematically in Fig. 20. The algorithm used in
the final stage of approach to the desired portion of the tape or other recording medium may
incorporate a control employing Fuzzy logic, Neural Networks, mathematical formulae modeling
the system (differential equations) in a Model-based system, a Proportional-Integral-Derivative
(PID) system, or a controller employing an algorithm of higher order, or other known control
methods. 
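
Of the control methods listed, the PID approach is perhaps the simplest to sketch. The following example assumes an idealized transport in which the tape position simply integrates the commanded velocity; the gains, the time step, and the names `PID` and `approach` are hypothetical. The integral gain is set to zero here because this idealized model has no steady-state error to remove.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.previous_error = None

    def update(self, error, dt):
        """Return the control output for the current error sample."""
        self.integral += error * dt
        if self.previous_error is None:
            derivative = 0.0  # no earlier sample on the first call
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def approach(target, position, steps=200, dt=0.1):
    """Drive an idealized transport toward the target tape position."""
    controller = PID(kp=1.0, ki=0.0, kd=0.1)
    for _ in range(steps):
        velocity = controller.update(target - position, dt)
        position += velocity * dt  # assumed plant: position integrates velocity
    return position
```

With these assumed gains the commanded velocity falls off as the target nears, which corresponds to the tape being slowed in the final stage of approach.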

If a selection is to be recorded over, the start and stop locations would be automatically
determined from the locations already indicated on the tape. Further, this information could be
stored in a memory device (which reads a catalog or index of the tape when a new tape is loaded)
or a non-volatile memory device (which stores information relating to known tapes within the
device) or both types of memory in the VCR, so that an index function may be implemented in
the VCR itself, without the need to read an entire tape. Optionally, a printer, such as a thermal
label printer (available from, e.g. Seiko Instruments, Inc.), attached to the device, could be
available to produce labels for the tapes, showing the index, so that the contents of a tape may be
easily indicated. A label on the tape may also include a bar code or two-dimensional coding
system to store content or characterization information. The stored identification and index
information is thus stored in a human or machine readable form.

These contents, or a list of contents, need not necessarily be manually entered by the user
or created by the apparatus; rather, these may be derived from published data or a database, data
transmitted to the control, and/or data determined or synthesized by the control itself. For
example, broadcast schedules are available in electronic or machine readable form, and this
information may be used by the apparatus.

EXAMPLE 3

SERIAL DATA MEDIUM INDEX

Another aspect of the present invention relates to the cataloging and indexing of the 
contents of a storage medium. While random access media normally incorporate a directory of 
entries on a disk, and devices such as optical juke boxes normally are used in conjunction with 
software that indexes the contents of the available disks, serial access mass storage devices, such
as magnetic tape, do not usually employ an index; therefore, the entire tape must be searched in
order to locate a specific selection. 

In the present invention, an area of the tape, preferably at the beginning of the tape or at
multiple locations therein, is encoded to hold information relating to the contents of the tape.
This encoding is shown in Fig. 19, which shows a data format for the information. This format
has an identifying header 1901, a unique tape identifier 1902, an entry identifier 1903, a start
time 1904, an end time 1905 and/or a duration 1906, a date code 1907, a channel code 1908,
descriptive information 1909 of the described entry, which may include recording parameters and
actual recorded locations on the tape, as well as a title or episode identifying information, which
may be a fixed or variable length entry, and optionally representative scenes 1910, which may be
in analog, digital, or compressed form, or in a form related to the abstract characterizations of the
scenes formed in the operation of the device. Finally, there are error correcting codes 1911 for
the catalog entry, which may also include advanced block encoding schemes to reduce the effect
of non-Gaussian correlated errors which may occur on video tape, transmission media and the
like. This information is preferably a modulated digital signal, recorded on, in the case of Hi-Fi
VHS, one or more of the preexisting tracks on the tape, including the video, overscan area,
Audio, Hi-Fi stereo audio, SAP or control tracks. It should be noted that an additional track
could be added, in similar fashion to the overlay of Hi-Fi audio on the video tracks of Hi-Fi
VHS. It is also noted that similar techniques could be used with Beta format, 8mm, or other
recording systems, to provide the necessary indexing functions.

Digital data may also be superimposed as pseudonoise in the image information, or as
other information intermixed or merged with the video information. 
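
The catalog entry format of Fig. 19 might be modeled, purely for illustration, as the record below. The field names, the JSON payload, the four-byte header value, and the use of a CRC-32 checksum are assumptions of this sketch; in particular, the checksum only detects corruption and is a stand-in for, not an implementation of, the error correcting codes 1911.

```python
import json
import zlib
from dataclasses import asdict, dataclass

HEADER = b"CAT1"  # hypothetical identifying header (cf. 1901)


@dataclass
class CatalogEntry:
    tape_id: str      # unique tape identifier (cf. 1902)
    entry_id: int     # entry identifier (cf. 1903)
    start_time: str   # cf. 1904
    end_time: str     # cf. 1905
    date_code: str    # cf. 1907
    channel: int      # cf. 1908
    description: str  # title and recording parameters (cf. 1909)


def encode(entry):
    """Serialize an entry as header, length, payload, and checksum."""
    payload = json.dumps(asdict(entry), sort_keys=True).encode("utf-8")
    checksum = zlib.crc32(payload).to_bytes(4, "big")
    return HEADER + len(payload).to_bytes(2, "big") + payload + checksum


def decode(blob):
    """Parse a serialized entry, rejecting damaged data."""
    if blob[:4] != HEADER:
        raise ValueError("identifying header not found")
    length = int.from_bytes(blob[4:6], "big")
    payload = blob[6:6 + length]
    checksum = blob[6 + length:10 + length]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        raise ValueError("catalog entry is corrupt")
    return CatalogEntry(**json.loads(payload))
```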

The recording method is preferably a block encoding method with error correction within
each block, block redundancy, and interleaving. Methods are known for reducing the error rate
for digital signals recorded on unverified media, such as videotape, which are subject to burst
errors and long term non-random errors. Such techniques reduce the effective error rate to
acceptable levels. These are known to those skilled in the art and need not be discussed herein in
detail. A standard reference related to this topic is Digital Communications by John G. Proakis,
McGraw-Hill (1983). The digital data recording scheme is best determined according to the
characteristics of the recording apparatus. Therefore, if, e.g., a Sony Corporation helical scan
recording/reproducing apparatus was employed, one of ordinary skill in the art would initially
reference methods of the Sony Corporation for an optimal error correcting recording
scheme, which are available in the patent literature, in the U.S., Japan, and internationally, and
the skilled artisan would also review the known methods used by other manufacturers of digital
data recording equipment. Therefore, these methods need not be explained herein in detail.
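
Although the error correcting codes themselves are left to the cited literature, the role of interleaving against burst errors can be shown in a few lines. The sketch below covers only the interleaving step and contains no error correcting code; the function names and the assumption of equal-length blocks are illustrative.

```python
def interleave(blocks):
    """Emit equal-length blocks column by column as one byte stream."""
    width = len(blocks[0])
    return bytes(block[i] for i in range(width) for block in blocks)


def deinterleave(stream, block_count):
    """Recover the original blocks from an interleaved stream."""
    return [stream[i::block_count] for i in range(block_count)]
```

With three blocks interleaved, a burst that damages three consecutive bytes on the medium touches each block only once, so a per-block code able to correct a single byte error would suffice to recover all three.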

The catalog of entries is also preferably stored in non-volatile memory, such as a hard disk,
associated with the VCR controller. This allows the random selection of a tape from a library,
without need for manually scanning the contents of each tape. This also facilitates the random
storage of recordings on tape, without the requirement of storing related entries in physical
proximity with one another so that they may be easily located. This, in turn, allows more
efficient use of tape, because of reduced empty space at the end of a tape. The apparatus is
shown schematically in Fig. 20, in which a tape drive motor 2001, controlled by a transport
control 2002, which in turn is controlled by the control 2003, moves a tape 2005 past a reading
head 2004. The output of the reading head 2004 is processed by the amplifier/demodulator 2006,
which produces a split output signal. One part of the output signal comprises the analog signal
path 2007, which is described elsewhere. A digital reading circuit 2008 transmits the digital
information to a digital information detecting circuit 2009, which in turn decodes the information
and provides it to the control 2003.

In order to retrieve an entry, the user interacts with the same interface that is used for
programming the recorder functions; however, the user selects different menu selections, which
guide him to the available selections. This function, instead of focusing mainly on the particular
user's history in order to predict a selection, would analyze the entire library, regardless of which 
user instituted the recording. Further, there would likely be a bias against performing identically 
the most recently executed function, and rather the predicted function would be an analogous 
function, based on a programmed or inferred user preference. This is because it is unlikely that a 
user will perform an identical action repeatedly, but a pattern may still be derived. 

It is noted that the present library functions differ from the prior art VHS tape index 
function, because the present index is intelligent, and does not require the user to mark an index 
location and explicitly program the VCR to shuttle to that location. Rather, the index is content 
based. Another advantage of the present library function is that it can automatically switch 
media and recording format, providing an adaptive and/or multimode recording system. Such a 
system might be used, for example, if a user wishes to record, e.g., "The Tonight Show With 
Johnny Carson" in highly compressed form, e.g. MPEG-2 at 200:1 compression, except during 
the performance of a musical guest, at which time the recording should have a much lower loss, 
e.g., MPEG-2 at 20:1, or in analog format uncompressed. A normal VCR could hardly be used 
to implement such a function even manually, because the tape speed (the analog of quality
level) cannot generally be changed in mid recording. The present system could recognize the 
desired special segment, record it as desired, and indicate the specific parameters on the 
information directory. The recorded information may then be retrieved sequentially, as in a 
normal VCR, or the desired selection may be preferentially retrieved. If the interface of the 
present invention is set to automatically record such special requests, the catalog section would 
then be available for the user to indicate which selections were recorded based upon the implicit 
request of the user. Because the interface has the ability to characterize the input and record 
these characterizations in the index, the user may make an explicit request different from the 
recording criteria, after a selection has been recorded. The controller would then search the 
index for matching entries, which could then be retrieved based on the index, and without a 
manual search of the entire tape. Other advantages of the present system are obvious to those of 
ordinary skill in the art.

A library system is available from Open Eyes Video, called "Scene Locator", which
implements a non-intelligent system for indexing the contents of a videotape. See NewMedia, 
November/December 1991, p. 69. 
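
In contrast to a non-intelligent index, the content-based lookup described above, in which the controller searches the index for matching entries rather than the tape itself, might be sketched as follows. The dictionary keys (`tape_id`, `start_position`, `tags`) and the function name `find_entries` are hypothetical; the characterizations actually stored in the index are not specified here.

```python
def find_entries(index, required_tags):
    """Return index entries whose characterizations include every tag.

    Results are ordered by tape and position so that a transport could
    shuttle to each selection in turn without reading the whole tape.
    """
    wanted = set(required_tags)
    matches = [entry for entry in index if wanted <= set(entry["tags"])]
    return sorted(matches, key=lambda e: (e["tape_id"], e["start_position"]))
```

Because the request is evaluated against stored characterizations, it may differ from the criteria under which the selection was originally recorded.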

It is noted that, if the standard audio tracks are used to record the indexing information, 
then standard audio frequency modems and recording/receiving methods are available, adapted 
to record or receive data in half-duplex mode. These standard modems range in speed from 300 
baud to about 64 kilobits per second, e.g. V.29, V.17, V.32, V.32bis, V.34, V.90, V.91, etc. While
these systems are designed for dial-up telecommunications, and are therefore designed for the
limited data rates available from POTS, are limited to a slower speed than necessary, and
incorporate features unnecessary for closed systems, they require a minimum of design effort, and
the same circuitry may be multiplexed and also be used for telecommunication with an on-line
database, such as a database of broadcast listings, discussed above. It should be noted that a
full-duplex modem should be operated in half-duplex mode when reading or recording on a medium,
thus avoiding the generation of unnecessary handshaking signals. Alternatively, a full-duplex
receiver may be provided with the resulting audio recorded. A specially programmed receiver
may extract the data from the recording. DTMF codes may also be employed to store
information.
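
As a minimal sketch of the DTMF option, the following maps a string of symbols to the standard DTMF low-group and high-group tone pairs that could be recorded on an audio track. The frequencies are the standard DTMF values; the function name dtmf_pairs and the example index field "0427#" are assumptions made for illustration.

```python
DTMF_ROWS = (697, 770, 852, 941)         # Hz, low-group frequencies
DTMF_COLS = (1209, 1336, 1477, 1633)     # Hz, high-group frequencies
DTMF_KEYS = ("123A", "456B", "789C", "*0#D")


def dtmf_pairs(symbols):
    """Map each symbol to its (low, high) tone pair."""
    table = {
        key: (DTMF_ROWS[r], DTMF_COLS[c])
        for r, row in enumerate(DTMF_KEYS)
        for c, key in enumerate(row)
    }
    return [table[s] for s in symbols]


# Hypothetical index field: a four-digit catalog number terminated by '#'.
tones = dtmf_pairs("0427#")
```

Each pair would then be synthesized as the sum of two sinusoids for a fixed duration; tone generation and detection are omitted here.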

The Videotext standard may also be used to record the catalog or indexing information on 
the tape. This method, however, if used while desired material is on the screen, makes it difficult 
(but not impossible) to change the information after it has been recorded, without re-recording 
entire frames, because the videotext uses the video channel, during non-visible scan periods 
thereof. The video recording system according to the present invention preferably faithfully 
records all transmitted information, including SAP, VIR, closed caption and videotext
information, which may be used to implement the various functions. 

On-line database listings may be used by the present interface to provide
information to be downloaded and incorporated in the index entry of the library function, and
may also be used as part of the intelligent determination of the content of a broadcast. This
information may further be used for explicitly programming the interface by the user, in that the
user may be explicitly presented with the choices available from the database.

EXAMPLE 4

CONTROLLED ENCRYPTION AND ACCOUNTING SYSTEM 

The present invention also allows for scrambling, encryption and locking of source 
material, and the receiving device selectively implements an inverse process or a partial inverse 
process for descrambling, decryption or unlocking of the material, much as the Videocipher 
series systems from General Instruments, and the fractal enciphering methods of Entertainment
Made Convenient² Inc. (EMC², and related companies, e.g., EMC³) and Iterated Systems, Inc.
The present invention, however, is not limited to broadcasts, and instead could implement a 
system for both broadcasts and prerecorded materials. In the case of copying from one tape to 
another, such a system could not only provide the herein mentioned library functions of the 
present invention according to Example 2, it could also be used to aid in copy protection, serial 
copy management, and a pay-per-view royalty collection system. 

Such a system could be implemented by way of a telecommunication function 
incorporated in the device, shown as block 1808 of Fig. 18, or an electronic tag which records 
user activity relating to a tape or the like. Such tags might take the form of a smart card, 
PCMCIA device, or other type of storage device. A royalty fee, etc., could automatically be 
registered to the machine either by telecommunication or registry with the electronic tag, 
allowing new viewer options to be provided as compared with present VCR's. 

Numerous digital data encryption and decryption systems are known. These include 
DES, "Clipper", elliptic key algorithms, public key/private key (RSA, etc.), PGP, and others. 
Digital encryption allows a sender to scramble a message so that, with an arbitrary degree of 
difficulty, the message cannot be determined without use of a decryption key. 

An encrypted tape or other source material may be decrypted with a decryption key 
available by telecommunication with a communication center, remote from the user, in a 
decryption unit, shown schematically as the decrypt unit 1806a of Fig. 18. Such an 
encryption/decryption scheme requires special playback equipment, or at least equipment with 
decryption functionality, and thus any usage of decrypted data may be registered as a result of the
requirement to receive a decryption key. The decryption unit may be part of an addressable 
remote unit for control of the unit remotely. 

During acquisition of the electronic decryption key, a VCR device of an embodiment of
the present invention would indicate its identity or electronic address, and an account is charged 
a fee for such use. The negotiation for the electronic key is also preferably encrypted. In 
addition, the decryption key may be specific for a particular decoder. Such a system could also 
be used for controlled access software, for example for a computer, wherein a remote account is 
charged for use of the software. Information communication may be through the Internet or 
through an on-line service such as America Online or Compuserve. 

Such a system differs from the normal hardware "key" or "dongle" (device which attaches 
to standard hardware port for authentication and usage limitation) because it requires on-line or 
electronic access for an encryption key, which may offer different levels of use. It also differs 
from a call-in registration, because of the automatic nature of the telecommunication. This 
presently described system differs from normal pay-per-view techniques because it allows, in 
certain instances, the user to schedule the viewing. Finally, with an encryption function 
implemented in the VCR, the device allows a user to create and distribute custom "software" or 
program material. In addition, the present controller could then act as the "telecommunication 
center" and authorize decryption of the material. 

If the source signal is in digital form, a serial copy management scheme is
preferably implemented. 

The present invention is advantageous in this application because it provides an advanced 
user interface for creating a program (i.e. a sequence of instructions), and it assists the user in 
selecting from the available programs, without having presented the user with a detailed 
description of the programs, i.e., the user may select the choice based on characteristics rather 
than literal description. 

In the case of encrypted program source material, it is particularly advantageous if the 
characterization of the program occurs without charging the account of the user for such 
characterization, and only charging the account if the program is viewed by the user. The user 
may make a viewing decision based on the recommendation of the interface system, or may 
review the decision based on the title or description of the program, or after a limited duration of 
viewing. Security of the system could then be ensured by a two level encryption system, wherein 
the initial decryption allows for significant processing, but not comfortable viewing, while the 
second level of decryption allows viewing, and is linked to the accounting system. Alternatively,
the decryption may be performed so that certain information, less than the entirety, is available in 
a first decryption mode, while other information comprising the broadcast information is 
available in a second decryption mode. 
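
The alternative arrangement, in which one decryption mode exposes only part of the information and a second mode exposes the remainder and triggers accounting, might be sketched as follows. The XOR keystream is a toy placeholder standing in for a real cipher and offers no security; the names keystream_xor, Account, characterize and view, and the key and fee values, are assumptions made for this sketch.

```python
import hashlib


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy XOR stream cipher (placeholder only, NOT secure); applying it twice decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Account:
    """Hypothetical user account held by the accounting system."""
    def __init__(self):
        self.charges = 0

    def charge(self, fee):
        self.charges += fee


def characterize(program, analysis_key):
    """First mode: recover only the descriptive part; no fee is charged."""
    return keystream_xor(program["descriptor"], analysis_key)


def view(program, viewing_key, account, fee=1):
    """Second mode: recover the full content and register the fee."""
    account.charge(fee)
    return keystream_xor(program["content"], viewing_key)


k1, k2 = b"analysis-key", b"viewing-key"
program = {
    "descriptor": keystream_xor(b"talk-show, 60 min", k1),
    "content": keystream_xor(b"<full program data>", k2),
}
acct = Account()
summary = characterize(program, k1)
charged_after_preview = acct.charges
content = view(program, k2, acct)
```

The account is untouched while the interface characterizes the program, and is charged only when the viewing key is exercised.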

The transmission encryption system may be of any type, but for sensitive material, i.e. 
where mere distortion of the material (e.g., loss of synchronization information and phase 
distortion) would be insufficient, an analog multiple subband transform, with spread spectrum 
band hopping and digital encryption of various control signals, would provide a system which 
would be particularly difficult for the user to view without authorization, and could be effectively 
implemented with conventionally available technology. The fractal compression and encryption 
of the EMC² and Iterated Systems, Inc. system is also possible, in instances where the broadcast
may be precompressed prior to broadcast and the transmission system supports digital data. Of 
course, if a digital storage format is employed, a strict digital encryption system of known type 
may be used, such as those available from RSA. The implementation of these encryption 
systems is known to those skilled in the art. These may include the National Bureau of 
Standards (NBS), Verifiable Secret Sharing (VSS) and National Security Agency (NSA) 
encryption standards, as well as various proprietary standards. 

EXAMPLE 5 
USER INTERFACE 

In one embodiment of the present invention, the apparatus comprises a program entry 
device for a VCR or other type of media recording system. The human interface element has an 
infrared device to allow wireless communication between the human interface device and the 
VCR apparatus proper. The human interface device also includes a direct-manipulation type 
input device, such as a trackball or joystick. Of course, it is understood that various known or
to-be-developed alternatives can be employed, as described above.

It is noted that many present devices, intended for use in computers having graphic 
interfaces, would advantageously make use of an input device which is accessible, without the 
necessity of moving the user's hands from the keyboard. Thus, for example, Electronic 
Engineering Times (EET), October 28, 1991, p. 62, discloses a miniature joystick incorporated 
into the functional area of the keyboard. This technique is directed at a different aspect of user
interaction with a programmable device than certain preferred embodiments of the present 
invention, in that the input device does not have a minimal number of keys. While the device 
disclosed in EET is intended for use in a full function keyboard, the preferred embodiment of the 
present invention is directed towards the minimization of the number of keys and avoidance of 
superfluous keys by provision of a pointing device. Of course, the present invention could be 
used with a full function input device, where appropriate, and the joystick of EET (10/28/91, p. 
62) would be suitable in this case. 

The interface of the present invention studies the behavior and moods of the user, in 
context, during interactions to determine the expected user level of that user as well as the 
preferences of the user. These user characteristics may change over time and circumstances. 
This means that the system studies the interaction of the user to determine the skill of the user or 
his or her familiarity with the operation and functionality of the system. By determining the skill 
of the user, the system may provide a best compromise. The purpose of this feature is to provide 
a tailored interface adapted to the characteristics of the user, thus adaptively providing access to 
various features in a hierarchical manner such that a most likely feature to be used is more easily 
accessible than an unlikely feature, but that features can generally be accessed from all or most 
user levels. The user level analysis also allows the system to teach the user of the various 
functions available, particularly when it becomes apparent that the user is being inefficient in the 
use of the system to perform a given task. Therefore, the menu structure may also be adaptive to 
the particular task being performed by the user. When combined with the user level analysis 
feature, the user efficiency feature will provide a preferable interface, with reduced learning time 
and increased usability for a variety of users. 

Thus, an important concept is that the system has at least one object having a plurality of 
functions, certain of which are unnecessary or are rarely used for various applications or in 
various contexts, while these are used with greater frequency in other contexts. Further, based 
upon predetermined protocols and learned patterns, it is possible to predict which functions will 
be used and which will not be used. 

Therefore, the system, upon recognizing a context, will reconfigure the availability or 
ease of availability of functions and allow various subsets to be used through "shortcuts". Thus, 
to some extent, the interface structure may vary from time to time based upon the use of the
system. The prior art apparently teaches away from this concept, because it is believed to
prevent standardization, limit the "recordability" of macros and/or instruction sheets for casual
users, and limit the availability of technical support. Each of these can be addressed, to some
extent, by the availability of a default mode (so that users can access all information), and because
the interface is self-simplifying in case of difficulty. However, forcing all users to always work
in a default mode limits the improvements in productivity that may be gained by a data-sensitive 
processing system, and hence this standardization for its own sake is rejected by the present 
invention. 
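
A minimal sketch of a context-sensitive menu with a default mode is given below. It assumes that context is reduced to a simple label and that "ease of availability" is modeled as menu order plus a short list of shortcuts; the class AdaptiveMenu, its method names and the example contexts are hypothetical.

```python
from collections import Counter, defaultdict


class AdaptiveMenu:
    def __init__(self, functions, shortcuts=3):
        self.functions = list(functions)      # default mode: full list in fixed order
        self.shortcuts = shortcuts
        self.usage = defaultdict(Counter)     # context -> function -> use count

    def record(self, context, function):
        """Note that a function was used in a given context."""
        self.usage[context][function] += 1

    def menu(self, context, default_mode=False):
        """Full menu, reordered by learned usage unless default mode is requested."""
        counts = self.usage[context]
        if default_mode or not counts:
            return list(self.functions)
        # Stable sort: most-used first, ties keep the default order.
        return sorted(self.functions, key=lambda f: -counts[f])

    def shortcut_keys(self, context):
        """The few functions actually used in this context, most-used first."""
        counts = self.usage[context]
        return [f for f in self.menu(context) if counts[f] > 0][: self.shortcuts]


vcr_menu = AdaptiveMenu(["play", "record", "timer", "clock", "index"])
for used in ["timer", "timer", "record"]:
    vcr_menu.record("evening", used)

evening = vcr_menu.menu("evening")
fallback = vcr_menu.menu("evening", default_mode=True)
```

Every function remains reachable in every context; only the ordering and the shortcut list change, and the default mode restores the standard layout.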

The improvements to be gained by using an intelligent data analysis interface for 
facilitating user control and operation of the system are more than merely reducing the average 
number of keystrokes or time to access a given function. Initial presentation of all available 
information to a new user might be too large an information load, leading to inefficiency, 
increased search time and errors. Rather, the improvements arise from providing a means for 
access of and availability to functions not necessarily known to the user, and to therefore 
improve the perceived quality of the product. 

The system to determine the sophistication of the user includes a number of storage 
registers, for storing an analysis of each act for each user. A given act is represented in a 
plurality of the registers, and a weighting system ensures that, even though an act is represented
in a number of registers, it is not given undue emphasis in the analysis. Thus, each act of the 
user may be characterized in a number of ways, and each characteristic stored in an appropriate 
register, along with a weighting representing an importance of the particular characteristic, in 
relation to other identified characteristics and in relation to the importance of the act as a whole. 
The act is considered in context, and therefore, the stored information relates to the act, the 
sequence of acts prior to the act, acts of the user occurring after the act, the results of the sequence of
acts which include the act, and characteristics of the user which are not "acts", but rather include 
timing, mouse path efficiency, and an interaction with other users. 
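
One way to keep an act from receiving undue emphasis when it is represented in several registers is to scale its per-register weights so that they sum to the importance of the act as a whole. The following sketch assumes that normalization rule; the register names and the function store_act are hypothetical.

```python
from collections import defaultdict


def store_act(registers, characteristics, act_importance=1.0):
    """Spread one act over several registers without over-counting it.

    characteristics maps a register name to a relative weight; the relative
    weights are scaled so that their total equals act_importance.
    """
    total = sum(characteristics.values())
    for name, relative in characteristics.items():
        registers[name] += act_importance * relative / total


registers = defaultdict(float)

# One act characterized three ways; it still contributes 1.0 in total.
store_act(registers, {"used_help": 2.0, "slow_response": 1.0, "inefficient_path": 1.0})

# A less important act characterized one way; it contributes 0.5 in total.
store_act(registers, {"used_shortcut": 1.0}, act_importance=0.5)
```

The contextual information (preceding and following acts, results, timing) would be stored alongside these weights and is not modeled here.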

An apparatus for performing a path information or efficiency determining function is 
shown schematically in Fig. 18, and in more detail in Fig. 21. Thus, for example, if a
characteristic of the user is an unsteady hand while using the cursor control device, e.g. mouse, 
producing a high frequency or oscillating component, the existence of this characteristic is
detected and quantified by the high frequency signal component detector 2112, and, depending
on the amplitude, frequency and duration (e.g. path length), may also be detected by the path
optimization detector 2105. Once this characteristic is detected and quantified, an adaptive filter
may be applied by the main control 1806 to selectively remove the detected component from the
signal, in order to improve the reliability of the detection of other characteristics and to determine
the intended act of the user.
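
The detect-then-filter behavior described above can be sketched with a simple moving average standing in for the adaptive filter. This is a simplification under stated assumptions: the cursor path is one coordinate sampled at a fixed rate, the high frequency component is quantified as the RMS residual after smoothing, and the window and threshold values are arbitrary.

```python
def moving_average(samples, window=5):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    half = window // 2
    last = len(samples) - 1
    out = []
    for i in range(len(samples)):
        h = min(half, i, last - i)
        out.append(sum(samples[i - h:i + h + 1]) / (2 * h + 1))
    return out


def tremor_level(samples, window=5):
    """RMS of the residual left after smoothing: a crude high-frequency measure."""
    smooth = moving_average(samples, window)
    squared = sum((s - m) ** 2 for s, m in zip(samples, smooth))
    return (squared / len(samples)) ** 0.5


def filter_path(samples, threshold=0.5, window=5):
    """Smooth only when tremor is detected, so a steady user keeps full response."""
    if tremor_level(samples, window) > threshold:
        return moving_average(samples, window)
    return list(samples)


steady = [float(i) for i in range(20)]                          # smooth cursor sweep
shaky = [i + (2.0 if i % 2 else -2.0) for i in range(20)]       # same sweep with oscillation
```

A steady path passes through unchanged, while an oscillating path is smoothed before further characteristics are extracted from it.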

It should be noted that the various characteristic filters preferably act in "parallel" at each
stage of the characteristic recognition, meaning that one characteristic is defined simultaneously
with the detection of other characteristics, which assists in resolving ambiguities, allows for
parallel processing by a plurality of processing elements which improves real-time recognition
speed, and allows a probability-based analysis to proceed efficiently. Such a "parallel"
computation system is included in a neural net computer, and a hardware-implementation of a
neural net/fuzzy logic hybrid computer is a preferred embodiment, which allows fuzzy rules to be
programmed to provide explicit control over the functioning of the system. It is preferred that a
human programmer determine the basic rules of operation of the system, prior to allowing a
back-propagation of errors learning algorithm to improve and adapt the operation of the system.

The adaptive system implemented according to the present invention, by detecting a user
level, allows a novice user to productively interact with the system while not unnecessarily
limiting the use of the adaptive interface by an advanced user, who, for example, wishes to move
the cursor quickly without the limiting effects of a filter which slows cursor response.

Another example of the use of an adaptive user interface level is a user who repeatedly
requests "help" or user instructions, through the explicit help request detector 2115, which causes
an output from the current help level output 2102; such a user may benefit from an automatic
context-sensitive help system; however, such a system may interfere with an advanced user, is
unnecessary in that case, and should be avoided. This adaptive user interface level concept is
not limited to a particular embodiment of the present invention, such as a VCR, and in fact, may
be broadly used wherever a system includes an interface which is intended for use by both
experienced and inexperienced users. This differs from normal help systems which must be
specifically requested, or "balloon help" (Apple Computer, Macintosh System 7.0, 7.1, 7.5)
which is either engaged or disengaged, but not adaptive to the particular situation based on an
implicit request or predicted need. In the case of a single user or group of users, the interface
could maintain a history of feature usage for each user, as in the past user history block 2107, and
provide a lower user interface level for those features which are rarely used, and therefore less
familiar to the user, through the current user level output 2101.

It should be noted that the present system preferably detects an identity of a user, and 
therefore differentiates between different users by an explicit or implicit identification system. 
Therefore, the system may accumulate information regarding users without confusion or 
intermingling. 

EXAMPLE 6 

VCR PROGRAMMING PREFERENCE PREDICTION 

The device according to the present invention is preferably intelligent. In the case of a
VCR, the user could also input characteristics of the program material that are desired, and
characteristics of program material that are not desired. The device would then, over time,
monitor various broadcast choices, and determine which most closely match the criteria, thereby
identifying them. For example, if the user prefers "talk-shows", and indicates a dislike for "situation
comedies" ("sitcoms"), then the device could scan the various available choices for
characteristics indicative of one or the other type of programming, and perform a correlation to
determine the most appropriate choice(s). A sitcom, for example, usually has a "laugh track"
during a pause in normal dialogue. The background of a sitcom is often a confined space (a
"set"), from different perspectives, which has a large number of "props" which may be common
or unique. This set and the props, however, may be enduring over the life of a show.

A talk-show, on the other hand, more often relies on actual audience reaction (possibly in
response to an "applause" sign), and not prerecorded or synthesized sounds. The set is simple,
and the broadcast often shows a head and neck, or full body shot with a bland background, likely
with fewer enduring props. A signal processing computer, programmed for audio and/or video
recognition, is provided to differentiate between at least the two types with some degree of
efficiency, and, with a possibly extended sampling time, to have a recognition accuracy such that,
when this information is integrated with other available information, a reliable decision may be
made. The required level of reliability, of course, will depend on the particular application and a
cost-benefit analysis for the required system to implement the decision-making system.

Since the system according to the present invention need not display perfect accuracy, the 
preferred embodiment according to the present example applies general principles to new 
situations and receives user or other feedback as to the appropriateness of a given decision. 
Based on this feedback, subsequent encounters with the same or similar data sets will produce a 
result which is "closer" to an optimal decision. Therefore, with the aid of feedback, the search 
criterion would be improved. Thus, a user could teach the interface through trial and error to 
record the desired broadcast programs. Thus, the presently described recognition algorithms may 
be adaptive and learning, and need not apply a finite set of predetermined rules in operation. For 
such a learning task, a neural network processor may be implemented, as known in the art. 
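
The trial-and-error teaching described above can be illustrated with a single-layer, perceptron-style update, used here only as a small stand-in for the neural network processor. The feature names, the learning rate and the functions score and learn are assumptions made for this sketch.

```python
def score(weights, features):
    """Weighted sum of feature values; a positive score means 'record this'."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def learn(weights, features, liked, rate=0.5):
    """Adjust the weights only when the prediction disagreed with the user feedback."""
    predicted = score(weights, features) > 0
    if predicted != liked:
        sign = 1.0 if liked else -1.0
        for name, value in features.items():
            weights[name] = weights.get(name, 0.0) + sign * rate * value
    return weights


# Hypothetical extracted features for the two program types discussed above.
talk_show = {"live_audience": 1.0, "plain_background": 1.0, "laugh_track": 0.0}
sitcom = {"live_audience": 0.0, "plain_background": 0.0, "laugh_track": 1.0}

weights = {}
for _ in range(3):                         # a few rounds of user feedback
    learn(weights, talk_show, liked=True)
    learn(weights, sitcom, liked=False)
```

After feedback, the talk-show feature set scores positively and the sitcom feature set does not, without any fixed rule having been entered by the user.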

The feature extraction and correlation system according to the present invention is shown 
in Fig. 22. In this figure, the multimedia input, including the audio signal and all other available
data, is input at the video input 2201. The video portion is transferred to a frame buffer 2202,
which temporarily stores all of the information. All other information in the signal, including
audio, VIR, videotext, closed caption, SAP (second audio program), and overscan, is preferably
stored in a memory, and analyzed as appropriate. The frame buffer 2202 may have an integral or
separate prefiltering component 2203. The filtered signal(s) are then passed to a feature extractor
2204, which divides the video frame into a number of features, including movement, objects,
foreground, background, etc. Further, sequences of video frames are analyzed in conjunction
with the audio and other information, and features relating to the correlation of the video and
other information, e.g., correlation of video and audio, are extracted. Other information is also
analyzed and features extracted, e.g., audio and closed caption. All extracted features relating to
the multimedia input are then passed to a transform engine or multiple engines in parallel, 2205. 
These transform engines 2205 serve to match the extracted features with exemplars or standard 
form templates in the template database 2206. 

It should be noted that even errors or lack of correlation between certain data may provide 
useful information. Therefore, a mismatch between audio and closed caption or audio and SAP
may be indicative of useful information. For non-video information, exemplars or templates are 
patterns which allow identification of an aspect of the signal by comparing the pattern of an 
unidentified signal with the stored pattern. Thus, the voice patterns of particular persons and
audio patterns of particular songs or artists may be stored in a database and employed to identify 
a source signal. 

The transformed extracted features and the templates are then correlated by a correlator or
correlators 2207. The parallelization of implementation of the transforms and correlators serves
to increase the recognition speed of the device. It should be understood that appropriate systems
for parallelization are known in the art. For example, the TMS 320C80, also known as the TI
MVP (Texas Instruments multimedia video processor) contains four DSP engines and a RISC
processor with a floating point unit on a single die. A board including a TMS 320C80 is
available from General Imaging Corp., Billerica MA, the S/IP80, which may be programmed
with ProtoPIPE. In addition, a board including a TMS 320C80 is also available from Wintriss
Engineering Corp., San Diego, CA. Multiple MVP processors may also be parallelized for
additional computing power. The MVP may be used to analyze, in parallel, the multimedia input
signal and correlate it with stored patterns in a database. In this context, correlation does not
necessarily denote a strict mathematical correlation, but rather indicates a comparison to
determine the "closeness" of an identified portion of information with an unidentified portion,
preferably including a reliability indicator as well. For neural network-based processing, specific
hardware accelerators are also available, such as from Nestor, Inc. and Intel. Therefore, since there
may be multiple recognizable aspects of the unidentified data, and various degrees of genericness
of the characteristic recognized, it is preferred that, at this initial stage of the recognition process,
the output of the correlators 2207 be a data set, e.g. a matrix, series of pointers, or other
arrangement, so that sufficient information is available for higher level processing to allow
application of an appropriate decision process. Of course, if the characteristic to be detected is
simple and well defined, and the decision-making process may be implemented with a simple
correlation result, then a complex data set output is not required. In fact, the output of the
correlator may have a number of different forms, based on the context of the recognition process.
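
A correlator whose output is a data set rather than a single best match might be sketched as follows. Cosine similarity is used here as one possible measure of "closeness" and is an assumption, as are the feature vectors, the template names and the functions closeness and correlate; the reliability indicator is not modeled.

```python
import math


def closeness(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def correlate(features, templates):
    """Return {feature: {template: closeness}} so later stages see partial matches too."""
    return {
        f_name: {t_name: closeness(f_vec, t_vec) for t_name, t_vec in templates.items()}
        for f_name, f_vec in features.items()
    }


# Hypothetical three-element feature vectors.
templates = {"laugh_track": [1, 0, 0], "applause": [0, 1, 0]}
features = {"audio_segment": [2, 0, 0], "mixed_segment": [1, 1, 0]}

matrix = correlate(features, templates)
```

Higher level processing may then apply whatever decision process suits the context, using either the full matrix or only its strongest entries.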

If, for example, an exact match to an entire frame is sought, partial match information is 
not particularly useful, and is ignored in this process. (Of course, since the system is
"self-learning", the processing results may be maintained and analyzed for other purposes.) If the
system, on the other hand, is analyzing novel data, a full analysis would likely be necessary
including partial results and low correlation results. 

The outputs of the correlators are input into an adaptive weighing network 2208, to 
produce a probability of a match between a given feature and a given template. The recognition 
is completed in an identifier 2209, which produces a signal identifying one or more objects in the 
video frame input. The identifier 2209 also has an output to the template database 2206, which 
reinforces the recognition by providing feedback; therefore, if the same object appears again, it 
will be more easily recognized. The template database 2206 therefore also has an input from the 
feature extractor 2204, which provides it with information regarding the features recognized. It 
is also noted that, in addition to allowing recognition, the parallel transform engines 2205, 
correlators 2207, and adaptive weighing network 2208 also allow the system to ignore features
that, though complex, do not aid in recognition. 

For example, during dialogue, the soundtrack voice may correlate with the mouth 
movements. Thus, the mouth movements aid little in recognition, and may be virtually ignored, 
except in the case where a particular person's mouth movements are distinctive, e.g., Jim Nabors 
("Gomer Pyle"), and Tim Curry ("Rocky Horror Picture Show"). Thus, the complexity and
parallelism in the intermediate recognition stages may actually simplify the later stages by 
allowing more abstract features to be emphasized in the analysis. Animation poses a special 
example where audio and image data may be separated, due to the generally non-physiologic 
relation between the image and soundtrack. 

The pattern recognition function of the present invention could be used, in a VCR
embodiment according to the present invention, e.g., to edit commercials out of a broadcast,
either by recognition of characteristics present in commercials, in general, or by pattern 
recognition of specific commercials in particular, which are often repeated numerous times at 
various times of the day, and on various broadcast channels. Therefore, the system may acquire 
an unidentified source signal, which may be, for example, a 30 second segment, and compare 
this with a database of characteristics of known signals. If the signal does not match any 
previously known or identified signals, it is then subject to a characterization which may be the 
same or different than the characterization of the identified signals. The characterizations of the 
unidentified signal are then compared to characteristics to be recognized. If the unidentified 
signal meets appropriate criteria, a presumptive generic characterization is made. This
characterization is preferably confirmed by a user later, so that a positively identified signal is
added to the database of identified signals; however, under certain circumstances no confirmation 
is required. 
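
The identification flow described above (match against known signals, otherwise apply a generic characterization, then optionally confirm) might be sketched as follows. The signature strings, the trait names and the generic rule are placeholders invented for illustration; how a signature is actually derived from a segment is outside this sketch.

```python
def classify_segment(signature, traits, known, generic_rule):
    """Return (label, needs_confirmation) for one unidentified segment."""
    if signature in known:
        return known[signature], False           # specific, previously identified signal
    if generic_rule(traits):
        return "commercial (presumed)", True     # generic characterization, unconfirmed
    return "unidentified", False


def confirm(signature, known, label="commercial"):
    """Add a user-confirmed signal to the database of identified signals."""
    known[signature] = label


def looks_like_commercial(traits):
    """Hypothetical generic rule: short segment with an abrupt audio change."""
    return traits["duration_s"] <= 60 and traits["abrupt_audio_change"]


known = {"sig-A": "commercial"}
segment_traits = {"duration_s": 30, "abrupt_audio_change": True}

first_pass = classify_segment("sig-B", segment_traits, known, looks_like_commercial)
confirm("sig-B", known)
second_pass = classify_segment("sig-B", segment_traits, known, looks_like_commercial)
```

Once confirmed, a repeated commercial is recognized by its specific signature on later broadcasts and no further confirmation is requested.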

Certain media present a recognizable audio or video cue when a commercial break has
ended (e.g., sports events, such as the Olympic Games, will often have theme music or
distinctive images). The present device need not respond immediately to such cues, and may
incorporate a delay, which would store the information while a decision is being made. In the 
case of a video tape, the delay may be up to the time between the time of recording and the time 
of playback. Further, the temporary storage medium may be independent of the pattern 
recognition system. Thus, a system provided according to the present invention may actually 
include two independent or semi-independent data streams: the first serving as the desired signal 
to be stored, retaining visually important information, and the second providing information for 
storage relating to the pattern recognition system, which retains information important for the 
recognition process, and may discard this information after the pattern recognition procedure is 
complete. 

A system which provides a plurality of parallel data streams representing the same source 
signal may be advantageous because it allows a broadcast quality temporary storage, which may
be analog in nature, to be separate from the signal processing and pattern recognition stage, 
which may be of any type, including digital, optical, analog or other known types, which need 
only retain significant information for the pattern recognition, and therefore may be highly 
compressed (e.g. lossy compression), and devoid of various types of information which are 
irrelevant or of little importance to the pattern recognition functions. Further, the temporary 
storage may employ a different image compression algorithm, e.g. MPEG-4, MPEG-2 or
MPEG-1, which is optimized for retention of visually important information, while the recognition
system may use a compression system optimized for pattern recognition, which may retain 
information relevant to the recognition function which is lost in other compression systems, 
while discarding other information which would be visually important. Advantageously, 
however, the analysis and content transmission streams are closely related or consolidated, such 
as MPEG-7 and MPEG-4. 

In a particularly advantageous arrangement, the compression algorithm is integral to the
recognition function, preparing the data for the pattern matching and characterization, and 
therefore is optimized for high throughput. According to this embodiment, the initial 
compression may include redundant or uncompressed information, if necessary in order to 
achieve real-time or near real-time recognition, and thus may actually result in a larger
intermediate data storage requirement than the instantaneous data presented to the recognition 
system; however, the term "compression", in this case, applies to the long term or steady state 
status of the device, and in a real-time recognition function, the amount of data stored for use in 
recognition is preferably less than the cumulative amount of data presented, except during the 
very initial stages of data acquisition and possibly rare peaks. 

In the case where a high quality (low loss, e.g. broadcast quality) intermediate storage is 
employed, after a decision is made as to whether the data should be stored permanently or 
otherwise further processed or distributed, the data may be transferred to the appropriate system 
or subsystem of the apparatus. Alternatively, the high quality intermediate storage is retained, 
and no further processing is performed. In either case, the purpose of this storage is to buffer the 
source data until the computational latency resolves any decisions which must be made. 
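
The buffering role described above can be illustrated with a minimal sketch. It assumes a hypothetical `DecisionBuffer` class (not part of the original disclosure) that holds a bounded number of frames until a delayed keep-or-discard decision arrives; the capacity policy, which drops the oldest frames first, is likewise an assumption.

```python
from collections import deque


class DecisionBuffer:
    """Hold source frames until a (delayed) keep/discard decision arrives.

    Hypothetical illustration; the class name and capacity policy are assumptions.
    """

    def __init__(self, capacity):
        # A bounded deque drops the oldest frame when the buffer is full.
        self.frames = deque(maxlen=capacity)

    def push(self, frame):
        self.frames.append(frame)

    def resolve(self, keep):
        """Return buffered frames for permanent storage, or drop them."""
        kept = list(self.frames) if keep else []
        self.frames.clear()
        return kept


buf = DecisionBuffer(capacity=3)
for frame in ["f0", "f1", "f2", "f3"]:
    buf.push(frame)          # "f0" is evicted once the fourth frame arrives
stored = buf.resolve(keep=True)
```

In this sketch the buffer size bounds how long the recognition system may take to reach a decision before source data is lost.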

According to one aspect of the present invention, the source image may be compressed 
using the so called "fractal transform", using the method of Barnsley and Sloan, which is 
implemented and available as a hardware accelerator in product form from Iterated Systems, Inc., 
Norcross, GA, as the Fractal Transform Card (FTC) II, which incorporates eight fractal transform 
integrated circuit chips, 1 MByte of Random Access Memory (RAM), and an Intel i80960CA-25
µP, and operates in conjunction with P.OEM™ (Iterated Systems, Inc., Norcross, GA) software,
which operates under the Microsoft Disk Operating System (MS-DOS). FTC-II hardware
compression requires approximately 1 second per frame, while software decompression on an
Intel 80486-25 based MS-DOS computer, using "Fractal Formatter" software, can be performed
at about 30 frames per second, which allows approximately real time viewing. The Fractal
Video Pro 1.5 is a video codec for WIN, allowing software only playback at 15-30 fps, 70-150
Kbytes/sec. This is a non-symmetrical algorithm, requiring more processing to compress than to 
decompress the image. The FTC-IV Compression Accelerator Board is presently available.

This fractal compression method potentially allows data compression of upwards of
2000:1, while still maintaining an aesthetically acceptable decompressed image result. Further,
since the method emphasizes structural aspects of the image, as opposed to the frequency
decomposition used in DCT methods (JPEG, MPEG), elements of the fractal method could be
used as a part of the image recognition system. Of course, it should be appreciated that other
fractal processing methods are available and may be likewise employed.

Audio data is also compressible by means of fractal transforms. It is noted that the audio
compression and image recognition functions cannot be performed on the FTC-II board, and
therefore an alternate system must be employed in order to apply the pattern recognition aspects
of the present invention. It should also be noted that an even more efficient compression-pattern
recognition system could be constructed by using the fractal compression method in conjunction
with other compression methods, which may be more efficient under certain circumstances, such
as discrete cosine transform (DCT), e.g. JPEG or modified JPEG or wavelet techniques. Fractal
compression systems are also available from other sources, e.g. the method of Greenwood et al.,
Netrologic Inc., San Diego, CA. See also, Shepard, J.D., "Tapping the Potential of Data
Compression", Military and Aerospace Electronics, May 17, 1993, pp. 25-27.

A preferred method for compressing audio information includes a model-based
compression system. This system may retain stored samples, or derive these from the data
stream. The system preferably also includes high-level models of the human vocal tract and
vocalizations, as well as common musical instruments. This system therefore stores information
in a manner which allows faithful reproduction of the audio content and also provides emphasis
on the information-conveying structure of the audio signal. Thus, a preferred compression for
audio signals retains, in readily available form, information important in a pattern recognition
system to determine an abstract information content, as well as to allow pattern matching. Of
course, a dual data stream approach may also be applied, and other known compression methods
may be employed.

Because of the high complexity of describing a particular signal pattern or group of audio
or image patterns, in general, the system will learn by example, with a simple identification of a
desired or undesired pattern allowing analysis of the entire pattern, and extraction of
characteristics thereof for use in preference determination.

Barnsley and Sloan's method for automatically processing digital image data consisting
of image information, disclosed in U.S. Patents 5,065,447 and 4,941,193, both expressly
incorporated herein by reference, consists of the steps of storing the image data in the data
processor, then generating a plurality of uniquely addressable domain blocks from the stored
image data, each of the domain blocks representing a different portion of the image information
such that all of the image information is contained in at least one of the domain blocks. A
plurality of uniquely addressable mapped range blocks corresponding to different subsets of the
stored image data are created, from the stored image data, with each of the subsets having a
unique address. This step includes the substep of executing, for each of the mapped range
blocks, a corresponding procedure upon the one of the subsets of the stored image data which
corresponds to the mapped range block. Unique identifiers are then assigned to corresponding
ones of the mapped range blocks, each of the identifiers specifying for the corresponding mapped
range block a procedure and an address of the corresponding subset of the stored image data. For
each of the domain blocks, the one of the mapped range blocks which most closely corresponds
according to predetermined criteria is selected. Finally, the image information is represented as a
set of the identifiers of the selected mapped range blocks. This method allows a fractal
compression of image data. In particular, Drs. Barnsley and Sloan have optimized the match of
the domain blocks with the mapping region by minimizing the Hausdorff distance. A
decompression of the data proceeds analogously in reverse order starting with the identifiers and
the mapping regions to produce a facsimile of the original image. This system is highly
asymmetric, and requires significantly more processing to compress than to decompress.
Barnsley and Sloan do not suggest a method for using the fractal compression to facilitate image
recognition, which is a part of the present invention.
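
The block-matching steps described above can be sketched as follows. This is a simplified illustration only, not the patented implementation: it assumes a square grayscale image held as a list of rows, 2x2 domain blocks, mapped range blocks formed by averaging 4x4 regions down to 2x2 and rotating them, and a sum of squared differences in place of the Hausdorff distance. Intensity scaling and decoding are omitted, and the function names (`encode`, `shrink`, `rotate`, `distance`) are hypothetical.

```python
def block(img, r, c, size):
    """Extract a size x size block whose top-left corner is (r, c)."""
    return [row[c:c + size] for row in img[r:r + size]]


def shrink(b):
    """Halve a block by averaging each 2x2 group of pixels."""
    n = len(b) // 2
    return [[(b[2 * i][2 * j] + b[2 * i][2 * j + 1]
              + b[2 * i + 1][2 * j] + b[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(n)] for i in range(n)]


def rotate(b):
    """Rotate a square block 90 degrees clockwise."""
    return [list(row) for row in zip(*b[::-1])]


def distance(a, b):
    """Sum of squared differences (assumed stand-in for the Hausdorff distance)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def encode(img, size=2):
    """Map each domain block address to the identifier of its best mapped range block."""
    n = len(img)
    mapped = {}
    # Mapped range blocks: shrink each (2*size)-square subset, then apply 0-3 rotations.
    for r in range(0, n - 2 * size + 1, size):
        for c in range(0, n - 2 * size + 1, size):
            b = shrink(block(img, r, c, 2 * size))
            for proc in range(4):
                mapped[(proc, r, c)] = b  # identifier = (procedure, subset address)
                b = rotate(b)
    code = {}
    # Domain blocks tile the image; each is represented by its closest identifier.
    for r in range(0, n, size):
        for c in range(0, n, size):
            d = block(img, r, c, size)
            code[(r, c)] = min(mapped, key=lambda ident: distance(d, mapped[ident]))
    return code


image = [[float((r * c) % 7) for c in range(8)] for r in range(8)]
code = encode(image)  # 16 domain blocks, each mapped to a (procedure, row, col) identifier
```

The exhaustive search over all mapped range blocks for every domain block is what makes compression far more expensive than decompression in this kind of scheme.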

Basically, the fractal method proceeds from an understanding that real images are made 
up of a plurality of like subcomponents, varying in size, orientation, etc. Thus, a complex block 
of data may be described by reference to the subcomponent, the size, orientation, etc. of the 
block. The entire image may thus be described as the composite of the sub-images. This is what 
is meant by iterated function systems, where first a largest block is identified, and the pattern
mapping is repetitively performed to describe the entire image.

The Iterated Systems, Inc. FTC-II or FTC-IV board, if applied as a part of a system
according to the present invention, is preferably used in conjunction with a frame-grabber board, 
such as Matrox, Quebec, Canada, Image-LC board, or a Data Translation DT1451, DT2651, 
DT2862, DT2867, DT2861 or DT2871, which may perform additional functions, such as 
preprocessing of the image signal, and may be further used in conjunction with an image 
processing system, such as the Data Translation DT2878. Of course, it should be understood that 
any suitable hardware, for capturing, processing and storing the input signals, up to and including 
the state of the art, may be incorporated in a system according to the present invention without 
exceeding the scope hereof, as the present invention is not dependent on any particular 
subsystem, and may make use of the latest advances. For example, many modern systems 
provide appropriate functionality for digital video capture, either uncompressed, mildly 
compressed, or with a high degree of compression, e.g., MPEG-2. 

The Texas Instruments TMS320C80 provides a substantial amount of computing power 
and is a preferred processor for certain computationally intensive operations involving digital 
signal processing algorithms. A system employing parallel TMS320C40 processors may also
be used. The Intel Pentium series (or related processors from AMD, National Semiconductor, or 
other companies), DEC/Compaq Alpha, SPARC, or other processors intended for desktop 
computing may, either individually or in multiprocessor configurations, be used to process 
signals. 

A pattern recognition database system is available from Excalibur Technologies, San 
Diego, CA. Further, IBM has had pattern recognition functionality available for its DB/2 
database system, and has licensed Excalibur's XRS image retriever recognition software for 
DB/2. See, Lu, C., "Publish It Electronically", Byte, September 1993, pp. 94-109. Apple
Computer has included search by sketch and search by example functions in PhotoFlash 2.0. See
also, Cohen, R., "FullPixelSearch Helps Users Locate Graphics", MacWeek, August 23, 1993, p.
77.

Image processing hardware and systems are also available from Alacron, Nashua NH; 
Coreco, St. Laurent, Quebec; Analogic, and others. 

A fractal-based system for real-time video compression, satellite broadcasting and
decompression is also known from Iterated Systems, Inc. and Entertainment Made Convenient²,
Inc. (EMC²). In such a system, since the compressed signal is transmitted, the remote receiving
system need not necessarily complete decompression prior to the intelligent pattern recognition
function of the present invention. This system also incorporates anti-copy encryption and royalty
and accounting documentation systems. It is noted that the EMC² system does not incorporate
the intelligent features of the present invention.

A preferred fractal-based system according to the present invention provides the source
data preprocessed to allow easy and efficient extraction of information. While much
precharacterization information may be provided explicitly, the preferred system allows other,
unindexed information to also be extracted from the signal. Further, the preferred system
provides for an accounting system which facilitates pay-per-view functions. Thus, the interface
of the present invention could interact with the standard accounting system to allow
royalty-based recording or viewing, and possibly implement a serial-copy recording prevention system.
Prior art systems require a user to explicitly select a program, rather than allow an intelligent
system to assist in selection and programming of the device. The EMC² system is described in
"EMC² Pushes Video Rental By Satellite", Electronic Engineering Times, December 2, 1991,
pp. 1, 98. See also, Yoshida, J., "The Video-on-demand Demand", Electronic Engineering
Times, March 15, 1993, pp. 1, 72.

Fractal techniques may be used to store images on a writable mass storage medium, e.g.
CD-ROM compatible. The present system may thus be used to selectively access data on the
CD-ROM by analyzing the images, without requiring full decompression of the image data.

Wavelets hold promise for efficiently describing images (i.e., compressing the data) while
describing morphological features of the image. However, in contrast to wavelet transforms
which are not intended to specifically retain morphological information, the selection of the
particular wavelet and the organization of the algorithm will likely differ. In this case, the
transform will likely be more computationally complex and therefore slower, while the actual
compression ratios achieved may be greater.

Thus, one embodiment of the device according to the present invention may incorporate a
memory for storing a program, before being transferred to a permanent storage facility, such as
tape. Such a memory may include a hard disk drive, magnetic tape loop, a rewritable optical disk
drive, or semiconductor memories, including such devices as wafer scale memory devices. This
is shown diagrammatically as the intermediate storage 2210 of Fig. 22. The capacity of such a
device may be effectively increased through the use of image data compression, which may be
proprietary or a standard format, e.g. MPEG-1, MPEG-2 (Moving Picture Experts Group standard
employing DCT encoding of frames and interframe coding), MPEG-4 (Moving Picture Experts
Group standard employing DCT encoding of frames and interframe coding, as well as
model-based encoding methods), JPEG (Joint Photographic Experts Group standard employing DCT
encoding of frames), Px64 (Comité Consultatif International Télégraphique et Téléphonique
(International Telegraph and Telephone Consultative Committee) (CCITT) standard H.261,
videoconferencing transmission standard), DVI (Digital Video Interactive), CDI (Compact Disk
Interactive), etc.

Standard devices for processing such signals are available from 8x8, Inc., C-Cube,
Royal Philips Electronics (TriMedia), and other companies. Image processing algorithms
may also be executed on general purpose microprocessor devices. 

Older designs include the Integrated Information Technology, Inc. (IIT, now 8x8, Inc.)
Vision Processor (VP) chip, Integrated Information Technology Inc., Santa Clara, CA, the
C-Cube CL550B (JPEG) and CL950 (MPEG decoding), SGS-Thomson STI3220, STV3200,
STV3208 (JPEG, MPEG, Px64), LSI Logic L64735, L64745 and L64765 (JPEG) and Px64 chip
sets, and the Intel Corp. i750B DVI processor sets (82750PB, 82750DB). Various alternative
image processing chips have been available as single chips and chip sets; in board level products,
such as the Super Motion Compression and Super Still-Frame Compression by New Media
Graphics of Billerica, MA, for the Personal Computer-Advanced Technology (PC-AT, an IBM
created computer standard) bus; Optibase, Canoga Park, CA (Motorola Digital Signal Processor
(DSP) with dedicated processor for MPEG); NuVista+ from Truevision (Macintosh video
capture and output); New Video Corp. (Venice, CA) EyeQ Delivery board for Macintosh NuBus
systems (DVI); Intel Corp. ActionMedia II boards for Microsoft Windows and IBM OS/2 in
Industry Standard Architecture (ISA, the IBM-PC bus standard for 8 (PC) or 16 bit (PC-AT) slots);
Micro Channel Architecture (MCA) (e.g., Digital Video Interactive (DVI), Presentation Level
Video (PLV) 2.0, Real Time Video (RTV) 2.0) based machines; and as complete products, such
as MediaStation by VideoLogic.

Programmable devices, including the Texas Instruments TMS320C80 MVP (multimedia
video processor) may be used to process information according to standard methods, and further 
provide the advantage of customizability of the methods employed. Various available DSP chips, 
exemplary board level signal processing products and available software are described in more 
detail in "32-bit Floating-Point DSP Processors", EDN, November 7, 1991, pp. 127-146. The 
TMS320C80 includes four DSP elements and a RISC processor with a floating point unit. 

It is noted that the present interface does not depend on a particular compression format 
or storage medium, so that any suitable format may be used. The following references describe 
various video compression hardware: Kim, Y., "Chips Deliver Multimedia", Byte, December 
1991, pp. 163-173; and Donovan, J., "Intel/IBM's Audio-Video Kernel", Byte, December, 1991, 
pp. 177-202. 

It should also be noted that the data compression algorithm applied for storage of the 
received data may be lossless or lossy, depending on the application. Various different methods 
and paradigms may be used. For example, DCT (discrete cosine transform) based methods, 
wavelets, fractals, and other known methods may be used. These may be implemented by 
various known means. A compressed image may also be advantageously used in conjunction 
with the image recognition system of the present invention, as described above. In such a case, 
the compression system would retain the information most important in the recognition function, 
and truncate the unimportant information. 

A further method of performing pattern recognition, especially of two dimensional 
patterns, is optical pattern recognition, where an image is correlated with a set of known image 
patterns represented on a hologram, and the product is a pattern according to a correlation 
between the input pattern and the provided known patterns. Because this is an optical technique, 
it is performed nearly instantaneously, and the output information can be reentered into an 
electronic digital computer through optical transducers known in the art. Such a system is 
described in Casasent, D., Photonics Spectra, November 1991, pp. 134-140. See also references 
cited therein. 
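
As a purely digital analogy to the optical correlator described above (it is not the holographic method itself), the comparison of an input frame against a set of known patterns can be sketched with a normalized correlation score. The function names `normalized_correlation` and `best_match`, and the tiny 3x3 patterns, are assumptions made for illustration.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equally sized images (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def best_match(frame, templates):
    """Score the frame against every known pattern and return the best label."""
    scores = {name: normalized_correlation(frame, t) for name, t in templates.items()}
    return max(scores, key=scores.get), scores


templates = {
    "cross": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "diagonal": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}
frame = [[0, 1, 0], [1, 1, 1], [0, 1, 1]]  # a "cross" with one noisy pixel
label, scores = best_match(frame, templates)
```

An electronic implementation evaluates the templates one after another, whereas the optical system described above performs the comparison against the stored patterns nearly instantaneously.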

These optical recognition systems are best suited to applications where an
uncharacterized input signal frame is to be compared to a finite number of visually different
comparison frames (i.e., at least one, with an upper limit generally defined by the physical
limitations of the optical storage media and the system for interfacing to the storage media), and
where an optical correlation will provide useful information. Thus, if a user wished to detect one
of, e.g., "David Letterman", "Jay Leno", or "David Koppel", a number of different planar views,
or holograms in differing poses, of these persons would be formed as a holographic correlation
matrix, which could be superimposed as a multiple exposure, stacked in the width dimension, or
placed in a planar matrix, side by side. The detection system produces, from the uncharacterized
input image and the holographic matrix, a wavefront pattern that is detectable by photonic
sensors.

It is preferred that if multiple holographic images of a particular characterization are
employed, that they each produce a more similar resulting wavefront pattern than the holographic
images of other characterizations, in order to enhance detection efficiency. The optical pattern
recognition method is limited in that a holographic image must be prepared of the desired pattern
to be detected, and that optically similar images might actually be of a different image, if the
differences are subtle. However, this method may be used in conjunction with electronic digital
pattern recognition methods, to obtain the advantages of both. Methods are also known to
electronically write an image to a holographic storage medium, thereby facilitating its use in a
general-purpose image recognition system. Of course, the system may also be used to identify
talk show guests, such as "Richard Gere" or "Cindy Crawford", or these same individuals in
other contexts.

If image compression is used, once an image is compressed, it need not be decompressed
and returned to pixel, NTSC or other standard transmission or format for storage on tape, and
thus the compressed image information may be stored in the same format as is present in the
temporary storage medium. Thus, the block labeled intermediate processing 2211 of Fig. 22
shows that the intermediate storage need not retain the information as received from the frame
buffer 2202, and in fact, may prepare it for the feature extractor 2204. In addition, the storage
medium itself need not be normal videotape (S-VHS, VHS, Beta, 8mm, Hi-8) and may be an
adapted analog storage technique or a digital storage technique. Various magneto-optical
recording techniques are known, which can store between 128 MB (3½") and around 5 GB (11"),
uncompressed, which might be suitable for storing compressed digital or analog information.
Multilayer CD-ROM and short wavelength (e.g., blue) laser systems allow storage densities of
about 3.5 to 10 Gbytes per disk, allowing storage of over two hours of MPEG-2 encoded video.

It is also noted that the present technology could also be applied to any sort of mass
storage, such as for a personal computer. In such a case, a characteristic of the computer file,
which is analogous to the broadcast program in temporary storage of a VCR, is classified
according to some criteria, which may be explicit, such as an explicit header or identifying
information, or implicit, such as a document in letter format, or a memorandum, as well as by
words and word proximity. In particular, such a recognition system could differentiate various
clients or authors based on the content of the document, and these could be stored in different
manners. The text analysis system of a text-based computer storage system is analogous to the
program classification system of the VCR embodiment of the present invention. However, there
is a further analogy, in that the VCR could incorporate optical character recognition of text
displayed in the program material, employ voice recognition, or directly receive text information
as a part of a closed caption or videotext system. Thus, the VCR device according to the present
invention could recognize and classify programs based on textual cues, and make decisions based
on these cues. This might also provide a simple method of discriminating program material. For
example, if a commercial does not include closed captioning or Second Audio Program (SAP), while
the desired program does, or vice versa, then a commercial could be discriminated from a
program with very little computational expenditure.
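
The low-cost discrimination just described might be sketched as below. The `Segment` record and `label_segments` function are hypothetical names, and the sketch assumes the closed caption and SAP presence flags have already been detected for each segment of the broadcast.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_s: float            # segment start time in seconds (assumed field)
    has_closed_caption: bool
    has_sap: bool


def label_segments(segments, program_has_cc=True, program_has_sap=False):
    """Label a segment 'program' when its ancillary-signal flags match the
    desired program's profile, and 'commercial' otherwise."""
    labels = []
    for seg in segments:
        matches = (seg.has_closed_caption == program_has_cc
                   and seg.has_sap == program_has_sap)
        labels.append("program" if matches else "commercial")
    return labels


segments = [
    Segment(0.0, True, False),     # captioned program material
    Segment(600.0, False, False),  # uncaptioned break
    Segment(660.0, True, False),   # program resumes
]
labels = label_segments(segments)
```

Only two boolean comparisons per segment are needed, which reflects the "very little computational expenditure" noted above; the approach fails, of course, when commercials and program share the same profile.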

EXAMPLE 7 
VCR INTERFACE 

A particular VCR interface system according to one aspect of the present invention
includes an internal clock, a four-program memory, and the capability to display a graphical color
interface. By providing the user with the aforementioned features, this design is a unique
implementation for an instrument to be used for programming an event driven controller via an
interactive display. All information that the user needs is displayed on the screen to avoid or
minimize the unnecessary searching for information. This information includes the current date
and current time.

A simulation of the AKAI Inc. VCR VS303U (on-screen programming) and the interface
of the present invention were tested to evaluate users' performances. The AKAI interface of the
prior art, hereinafter referred to as the prior art interface, was chosen because users made the
fewest errors while using this machine, and no user quit while programming, as compared to
three other VCRs tested, a Panasonic (made by Matsushita, Inc.) PV4962 (Bar Coder), an RCA
brand (formerly Radio Corporation of America, Inc.) VKP950 (on-screen programming), and a
Panasonic brand (made by Matsushita Inc.) PV4700 (Display Panel).

The present embodiment was constructed and tested using HyperPAD™, a rapid
prototyping package for an IBM-PC Compatible Computer. It is, of course, obvious that the
present embodiment could be incorporated in a commercial VCR machine by those skilled in the
art, or be implemented on many types of general purpose computers with output screens which
allow on-screen feedback for the programming operation. Further, the system of the present
embodiment can include a remote-control device which communicates with a VCR through an
infrared beam or beams, and can thus exert control over an infrared remote controlled VCR, or
translate the programming information and communicate through an infrared remote control,
using the standard type infrared transmitter.

An IBM PC-AT compatible (MS-DOS, Intel 80286-10 MHz) computer was used to test
the two simulations. In order to simulate the use of a remote control device in programming the
VCR, an infrared device made by NView™ was attached to the computer. This device came
with a keyboard that was used to "teach" a Memorex™ Universal Remote so that the desired
actions could be obtained. By using a universal remote, the computer could be controlled by
using a remote control.

The present embodiment incorporates a mouse input device. It is understood that a small
trackball with a button for selection, mounted on a remote control, may also be employed, and
may be preferable in certain circumstances. However, a computer mouse is easily available, and
the mouse and trackball data are essentially similar for the type of task implemented by the user,
with trackball performance being slightly faster. For daily use on a VCR, however, a trackball
would be a preferable input device because it does not require a hard, flat surface, which is
not always available to a user when programming a VCR, such as in the situation where a person
is watching television while sitting in a chair or sofa.

A Genius M Mouse was used as the input device in the prototype of the interface of the
present invention. With the mouse, the user could view all of the choices at once on the display
screen, and then make a selection from the items on the screen by moving the cursor and then
pressing the left mouse button.

The interface of the present example focuses on attending to the user's needs, and the
interface must be modified for each application. By reducing the searching, learning times, and
entry times, the mental load is also minimized. Some tradeoffs are necessary as a result of
subjective and objective data. Because of the difficulty in optimizing a single interface design
for all levels of users, a menu system was used in an attempt to satisfy all these user types.

The interface of the present example reduced the number of incorrect recordings by 50%.
The severity of the errors is unimportant here because one wrong entry will cause an
irretrievable mistake and the user will not record the intended program. One study reported that
faulty inputs, which lead to missing the program, can be reported by almost every present day
owner of a VCR.

EXAMPLE 8 

PROGRAMMABLE DEVICE INTERFACE 

It is also noted that the interface of the present invention need not be limited to
audio-visual and multimedia applications, as similar issues arise in various programmable
controller environments. Such issues are disclosed in Carlson, Mark A., "Design Goals for an
Effective User Interface", Electro/82 Proceedings, 3/1/1-3/1/4; Kreifeldt, John, "Human Factors
Approach to Medical Instrument Design", Electro/82 Proceedings, 3/3/1-3/3/6; Wilke, William,
"Easy Operation of Instruments by Both Man and Machine", Electro/82 Proceedings,
3/2/1-3/2/4; Green, Lee, "Thermo Tech: Here's a common sense guide to the new thinking
thermostats", Popular Mechanics, October 1985, 155-159; Moore, T.G. and Dartnall, "Human
Factors of a Microelectronic Product: The Central Heating Timer/Programmer", Applied
Ergonomics, 1983, Vol. 13, No. 1, 15-23; and "The Smart House: Human Factors in Home
Automation", Human Factors in Practice, Dec. 1990, 1-36.

This generalized system is shown in Fig. 23, in which the sensor array 2301 interfaces
with a microprocessor 2302 with a serial data port 2302a, which transmits sensor data to a
control 2303. The control 2303 further interfaces or includes a data pattern recognition system
2304 and an interface and programming console 2305 according to the present invention, using
the aforementioned intelligent features and adaptive pattern recognition techniques. The control
2303 controls the plant 2306, which includes all the controlled actuators, etc.

EXAMPLE 9 

ADAPTIVE GRAPHIC INTERFACE 

A "smart screen" aspect according to the present invention is further explored in the 
present example. This aspect of the present invention allows the interface to anticipate or predict
the intent of the user, to provide, as a default user choice, the most likely action to be taken by
the user of the programmable device, which may be either accepted or rejected by the
user, without inordinate delay to the user. The intelligent selection feature may also
automatically choose an option and execute the selected option, without further intervention, in
cases where little or no harm will result. Examples of such harm include a loss of data, a
substantial waste of the user's time and an inappropriate unauthorized allocation of
computational resources.

When a user regularly applies the VCR device, for example, to record a particular
television show which appears weekly at a given time on a given television channel,
such an action could be immediately presented to the user as a first option, without
forcing him to explicitly program the entire sequence. Likewise, if the user has already entered
such a command, the presented choices could include a second most likely selection, as well as
the possibility of canceling the previously entered command.
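
One simple way to realize the default-choice behavior described above is to rank past actions by how often they occurred. The sketch below assumes a hypothetical history of (day, time, channel) tuples and a hypothetical `ranked_choices` helper; frequency counting is an illustrative stand-in for whatever prediction method an actual embodiment would use.

```python
from collections import Counter


def ranked_choices(history, already_entered=()):
    """Rank past actions by frequency, skipping ones already programmed."""
    counts = Counter(history)
    return [action for action, _ in counts.most_common()
            if action not in already_entered]


# Each entry is (day, start time, channel) of a past recording.
history = [
    ("Thu", "20:00", 4), ("Thu", "20:00", 4), ("Thu", "20:00", 4),
    ("Sun", "19:00", 2), ("Sun", "19:00", 2),
    ("Mon", "21:00", 7),
]
default = ranked_choices(history)[0]
# Once the default has been entered, the second most likely action is offered.
next_best = ranked_choices(history, already_entered={default})[0]
```

The user remains free to accept or reject the offered default, consistent with the description above.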

Further, if an entire television programming guide for a week or month is available as a
database, the interface could actively determine whether the desired show is preempted, a repeat
(e.g., one which has been previously recorded by the system), changed in time or programming
slot, etc. Thus, the interface could present information to the user, of which he might not be
aware, and/or predict an action based on that information. Such a device could, if set in a mode
of operation that allows such, automatically execute a sequence of instructions based on a
predicted course of action. Thus, if a user is to be absent for a period, he could set the machine
to automatically record a show, even if the recording parameters are not known with precision at
the time of setting by the user. Of course, this particular embodiment depends on the availability
of a database of current broadcast schedules; however, such a database may generally be
available, e.g., in an on-line database.
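
The guide lookup described above could be sketched as follows, assuming a hypothetical in-memory guide of listing records; the field names (`title`, `episode`, `channel`, `start`) and the `resolve_recording` function are illustrative assumptions, not a description of any actual schedule database.

```python
def resolve_recording(title, guide, recorded_episodes):
    """Find the next airing of `title` that has not already been recorded.

    Returns None when the show is preempted or absent from the available guide.
    """
    for listing in sorted(guide, key=lambda item: item["start"]):
        if listing["title"] != title:
            continue
        if listing["episode"] in recorded_episodes:
            continue  # a repeat of something already recorded
        return listing
    return None


guide = [
    {"title": "Nature Hour", "episode": "S2E3", "channel": 4, "start": "1999-03-04T20:00"},
    {"title": "Nature Hour", "episode": "S2E4", "channel": 4, "start": "1999-03-11T20:30"},
    {"title": "News", "episode": "n/a", "channel": 2, "start": "1999-03-04T18:00"},
]
# The first airing is a repeat; the second has moved to a later time slot.
choice = resolve_recording("Nature Hour", guide, recorded_episodes={"S2E3"})
```

Because the recording parameters are taken from the guide at lookup time, the user need not know the exact slot when setting the preference.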

Such an on-line database system of known type may be used and need not be described in
detail herein. Alternately, a printed schedule of broadcasts may be scanned into a computer and
the printed information deciphered (e.g., OCR) to gain access to a database. Other methods may
also be used to access scheduling information, e.g. access channels on cable systems, as well as
other broadcast information identifying future and imminent programming. Together, these
methods allow semiautonomous operation, guided by programming preferences rather than
explicit programs, where such explicit instruction is absent.

The smart screens according to the present invention may be implemented as follows.
The controller may be, for example, an Apple Power Macintosh 8100/110 AV computer,
operating under Macintosh 7.5 operating system. The HyperCard™ 2.3 software may be used to
implement the screen interface, which incorporates the above-described features, and which is
generally compatible with the HyperPAD software described above. HyperCard™ is mentioned
due to its capabilities to reference external programs, thus allowing interfacing to various
software and hardware devices. A more global scripting language, such as Frontier by UserLand
Software Inc., may also be used, especially where low level hardware control of interfaced
devices, such as a VCR, multimedia adapter, or the like is desired. Apple AppleScript may also
be used. The QuickTime format may be used to store and recall data; however, many acceptable
formats exist. The input device is an Apple Desktop Bus (ADB) mouse (Apple Computer Inc.,
Cupertino, CA), and the output display is an 8 bit or 24 bit graphics color adapter connected to,
e.g., a 14" color monitor. In addition, various parameters concerning the use of the interface are
stored in the computer's memory, and a non-volatile mass storage device, such as a hard disk
drive, or Electrically Erasable Programmable Read Only Memory (EEPROM) or Erasable
Programmable Read Only Memory (EPROM), as well as battery backed Random Access
Memory (RAM) could also be used.

A more modern implementation might employ, for example, a single or dual Pentium II
450 MHz workstation, running Microsoft Windows NT 4.0 (or Windows 2000, when available).
The hardware is a matter of choice, including memory, monitor, pointing device, graphic display
card, video capture card, mass storage options, and the like. Preferably, a hardware codec is
provided, for example a Media 100, Inc. Broadway device. The software may be, for example,
Microsoft Visual Basic 5.0 or other suitable development language.

Intel Pentium-based platforms may also be used, preferably in IBM-PC compatible
implementations. Intel 80860 and/or Intel 80960 processor platforms may also be used.

Alternatively, other Apple Power PC, Macintosh (MC680X0 series) or IBM Power PC
implementations may be used, the Power PC providing the advantage of increased processing
power over Motorola 680X0 derivatives. The specific Power PC employed may be any version,
including desktop system versions available from Apple and IBM and embedded versions from
IBM and Motorola. These Power PC processors may also be provided in a parallel processing
implementation. Further, custom implementations of Power PC hardware optimized for the
relevant computational tasks may be employed.

Of course, other systems, including DEC Alpha and HP 9000 systems, may also be
employed, as well as SPARC, MIPS, and other available RISC systems. While RISC systems,
possibly supplemented with DSP hardware, are presently preferred because of their efficiency in
executing the pattern recognition tasks, Complex Instruction Set Computer (CISC), hybrid and
other known processing systems may be employed. The Texas Instruments TMS320C80
combines a Reduced Instruction Set Computer (RISC) processor, Arithmetic Logic Unit (ALU)
and four DSP processors on a single chip, and is therefore a preferred processor for
implementing various aspects of the system, especially mathematical processing including DCT
and correlations.

According to the present invention, the interface may perform comparatively simple
tasks, such as standard graphic user interface implementation with optimized presentation of
screen options, or include more complex functionality, such as pattern recognition, pattern
matching and complex user preference correlations. Therefore, hardware requirements will
range from basic 68040, 80486, Pentium, Power PC, MIPS, SPARC, Digital Equipment Corp.
(DEC, now Compaq Computer Corp.) Alpha, or other microprocessors which are used to
perform visual or audio interface functions, to special purpose processors for
implementation of complex algorithms, including mathematical, neural network, fuzzy logic, and
iterated function systems (fractals).

It should be noted that, while many aspects of the intelligent interface according to the
present invention do not require extremely high levels of processing power, and therefore may be
provided with inexpensive and commonly available computing hardware, other aspects involve
complex pattern recognition and advantageously employ powerful processors to achieve a short
processing latency. Both simple and complex interface systems, however, are included within
the scope of the present invention. Processing may be distributed in different fashions, so that
complex functionality may be implemented with relatively simple local hardware, with a
substantial amount of the processing required for a high level of functionality performed centrally,
and for a large number of users.

From the stored information regarding the prior use of the interface by the user, including
prior sessions and the immediate session, and a current state of the machine (including a received
data stream and information relating to the data stream previously stored), a predicted course of
action or operation may be realized. This predicted operation is, in the context of the current user
interface state, the most probable next action to be taken by the user.
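
As an illustration only, the selection of a most probable next action from stored usage history can be sketched as a frequency count keyed on the user and the current interface state. The class name, method names, and state labels below are hypothetical and do not appear in the specification; the specification itself leaves the predictive technology open.

```python
from collections import Counter, defaultdict


class ActionPredictor:
    """Counts which action followed each interface state, per user (illustrative sketch)."""

    def __init__(self):
        # (user, state) -> Counter of actions the user took from that state
        self._history = defaultdict(Counter)

    def record(self, user, state, action):
        """Store one observed action for this user in this interface state."""
        self._history[(user, state)][action] += 1

    def predict(self, user, state, n=3):
        """Return up to n (action, relative frequency) pairs, most frequent first."""
        counts = self._history.get((user, state))
        if not counts:
            return []  # no history: nothing useful to predict
        total = sum(counts.values())
        return [(action, count / total) for action, count in counts.most_common(n)]


# Hypothetical history: four recordings of one program, one of another.
predictor = ActionPredictor()
for _ in range(4):
    predictor.record("alice", "main_menu", "record NBC 11:00 p.m. news")
predictor.record("alice", "main_menu", "record ABC 11:00 p.m. news")
choices = predictor.predict("alice", "main_menu")
```

Returning several ranked candidates rather than a single one leaves room for presenting a number of most probable choices simultaneously or in sequence.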

The predicted operation is based on: the identity of the user, if more than one user
operates the interface and machine; the information already entered into the interface during the
present programming session; the presently available choices for data entry; settings for the use
of the machine, which may be present as a result of a "setup" operation; settings saved during a
prior session; and a database of programming choices. In the case of a HyperCard script, the
interface software calls another program which has access to the necessary data in the memory,
as well as access to any remote database which may be necessary for implementation of the
function. Using a predictive technology, such as Boolean logic, fuzzy logic, neural network
logic, or other type of artificial intelligence, a most probable choice may be presented to the user
for his approval, or another alternative choice may be selected. Further, a number of most
probable choices may be presented simultaneously or in sequence, in order to improve the
probability that the user will be immediately or quickly presented with an acceptable choice. If
multiple choices are presented, and there is limited room on the display, two (or more) similar
choices may be merged into a single menu selection, which may be resolved in a secondary
menu screen, e.g., a submenu or dialog box.

Figure 24 shows a system for correlating a user's preferences with a prospective or real-time
occurrence of an event. The input device 2401, which is a remote control with a pointing
device, such as a trackball, provides the user's input to the control 2402. The program is stored 
in a program memory 2403, after it is entered. The control 2402 controls a plant 2404, which is a 
VCR. The control also controls an on-screen programming interface 2405, through which the 
user interactively enters the program information. Each program entry of the user is submitted to 
the user history database and preferences module 2406, which may also receive explicit 
preference information, input by the user through the input device 2401. The prospective and 
real time event characterization unit 2407 uses any and/or all relevant information available in 
order to determine the character of a signal input, which is a video signal, from the signal 
receiver 2408. A signal analyzer 2409 provides a preliminary analysis and characterization of 
the signal, which is input to the prospective and real time event characterization unit 2407. The 
prospective and real time event characterization unit 2407 also interacts and receives an input 
from a telecommunication module 2410, which in turn interacts and receives information from 
an on-line database 2411. A user preference and event correlator 2412 produces an output 
relating to a relatedness of an event or prospective event and a user preference. In the event of a 
high correlation or relatedness, the control 2402 determines that the event or prospective event is 
a likely or most likely predicted action. The prospective event discussed above refers to a 
scheduled event, which is likely to occur in the future. The characterization unit also has a local 
database 2413 for storing schedule information and the like. 
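
The role of the user preference and event correlator 2412 can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the user preferences and each event are reduced to sparse feature-weight dictionaries and that relatedness is measured by cosine similarity; the specification does not prescribe this measure, and the function names, feature labels, and threshold are hypothetical.

```python
import math


def relatedness(preferences, event):
    """Cosine similarity between two sparse feature-weight dicts (0.0 if either is empty)."""
    dot = sum(weight * event.get(name, 0.0) for name, weight in preferences.items())
    norm_p = math.sqrt(sum(w * w for w in preferences.values()))
    norm_e = math.sqrt(sum(w * w for w in event.values()))
    if norm_p == 0 or norm_e == 0:
        return 0.0
    return dot / (norm_p * norm_e)


def likely_events(preferences, events, threshold=0.7):
    """Names of events whose relatedness meets the (assumed) threshold, best first."""
    scored = [(relatedness(preferences, features), name) for name, features in events.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical preference profile and scheduled events.
prefs = {"news": 1.0, "nbc": 1.0}
events = {
    "NBC 11:00 news": {"news": 1.0, "nbc": 1.0},
    "ABC 11:00 news": {"news": 1.0, "abc": 1.0},
    "Fox sitcom": {"comedy": 1.0, "fox": 1.0},
}
selected = likely_events(prefs, events)
```

Lowering the threshold would admit the partially related event as a lower-probability alternative.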

In the particular context of a videotape, one consideration of the user is the amount of 
time remaining on the tape. Generally, users wish to optimally fill a tape without splitting a 
program, although the optimization and non-splitting parameters may vary between users. 
Therefore, the length of the tape and the amount and character of other items on the tape are also 
factors to be employed in determining a most desired result. With respect to this issue, the 
interface may maintain a library function which allows the identification of a partially filled tape 
for recording under given circumstances. The interface may also optimize a playback by 
selecting a tape containing a desired sequence of materials. 
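
One way the library function might pick a partially filled tape is a best-fit rule: choose the tape with the least free time that still holds the whole program, so that the program is not split. This is a sketch under that assumption; the function name and the representation of the tape library as a mapping from label to free minutes are hypothetical.

```python
def choose_tape(tapes, program_minutes):
    """Best fit: label of the tape with the least free time that holds the whole program.

    tapes maps a tape label to its remaining free time in minutes.
    Returns None when no tape can hold the program without splitting it.
    """
    fitting = [(free, label) for label, free in tapes.items() if free >= program_minutes]
    if not fitting:
        return None
    return min(fitting)[1]


# Hypothetical tape library: label -> free minutes.
library = {"A": 45, "B": 120, "C": 35}
half_hour_choice = choose_tape(library, 30)
hour_choice = choose_tape(library, 60)
```

A user who tolerates split programs, or who prefers grouping related material on one tape, would call for a different scoring rule; the passage above notes that these parameters vary between users.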

The intelligent interface may also be used as a part of an educational system, due to its
ability to adapt to the level of the user and dynamically alter an information presentation based
on the "user level", i.e., the training status of the user, and its ability to determine areas of high
and low performance. Likewise, the intelligent interface according to the present invention may
also be used in a business environment for use by trained individuals who require relatively static
software interface design for consistency and "touch typing" with memorized keystroke or mouse
click sequences. In this case, the intelligent functionality is segregated into a separate user
interface structure, such as an additional "pull down menu" or other available screen location.
While the interface always monitors user performance, the impact of the analysis of the user is
selectively applied. User analysis may also be used for performance evaluation according to
objective criteria, based on continuous monitoring. In a network environment, user profile and
evaluation may be made portable, stored so as to be accessible from any networked device the
user may interact with, from office computers to thermostats to photocopying machines to coffee
machines.

EXAMPLE 10 

INTELLIGENT ADAPTIVE VCR INTERFACE

A user interacting with the device intends to record a particular program, "Married With
Children" (Fox, Sunday, 9:00 p.m., etc.) on its every occurrence. This intent, however, is to
provide a full library of episodes, and not to duplicate episodes. The particular program is
subject to the occurrence of reruns, syndicated distribution, time shifting of performance,
preview scenes and advertisements. Further, various actors appearing in the particular program
also appear in other capacities and roles on television. Therefore, after this intent is elucidated,
the interface scans available directories of programming to determine when "Married With
Children" will be broadcast. In addition, to the extent possible, all channels may be monitored,
in the event that the directories are erroneous or incomplete.

It is noted that the interface may be quite effective if it is used for a number of
applications, such as television, radio, desktop computer, and even kitchen and HVAC systems.
For example, preferences for processing MTV or other music video information may be directly
relevant to processing of radio or other music reproduction devices, and vice versa.

At some point in the process, preferably prior to substantive programming input, the
interface performs a self-diagnostic check to determine whether the machine is set up and
operating correctly. This would include a determination of whether the clock has been set and
has thereafter operated continuously. Of course, the clock could have, in practice, a battery to
minimize the occurrence of problems relating to clock function. The interface would then, if the
clock is not properly set, and if there is no telecommunication or other external means for
automatically determining the exact time, present the user with a menu selection to set the proper
time. Of course, if the correct time is available to the apparatus in some form, this could be
automatically obtained, and the internal clock updated, without intervention. These same sources
may be used to verify the accuracy of an internal clock. Further, if a reliable external clock
system is available, an internal clock may be dispensed with or ignored. Time may also be
inferred based on the regular schedules of broadcasts, e.g., the 11:00 p.m. news begins at 11:00
p.m. If the user does not have access to a source of the exact time, the step of correcting the time
may be deferred, although at some point the user should be reminded to verify the clock
information. The user may thus be able to override a machine-generated request or attempt to
correct the time data.
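
The order of preference described above (external sources first, then the internal clock, then a prompt to the user) can be sketched as a simple fallback chain. The function and source names are hypothetical, and each source is modeled as a callable that returns a time or fails; this is an assumption made for illustration, not a description of any particular hardware.

```python
from datetime import datetime


def resolve_time(sources, internal_clock=None):
    """Try each external source in order, then the internal clock, else ask the user.

    sources is a list of (name, read) pairs, where read() returns a datetime,
    returns None, or raises OSError when the source is unavailable.
    Returns (time, origin); time is None when the user must be prompted.
    """
    for name, read in sources:
        try:
            value = read()
        except OSError:
            value = None  # source unreachable: fall through to the next one
        if value is not None:
            return value, name
    if internal_clock is not None:
        return internal_clock, "internal clock"
    return None, "prompt user"


# Hypothetical sources: a failed network query and a time read from a cable channel.
def network_time():
    raise OSError("no connection")


def cable_time_channel():
    return datetime(1999, 1, 3, 21, 0)


resolved, origin = resolve_time(
    [("network", network_time), ("cable time channel", cable_time_channel)]
)
```

Because the result carries its origin, the interface can decide whether a later reminder to verify the clock is still warranted.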

If the machine has access to an external source of the exact time, it would then preferably
access this source first. Such sources of exact time include a telephone connection to a voice line
which repeats the time. The computer would then perform a speech recognition algorithm which
would be used to determine the time. Such a speech recognition algorithm could also be used as
a part of the user interface for other purposes, i.e., a speech recognition system is not supplied
solely for obtaining time information. Alternatively, a modem or communication device could
be used to obtain the time in digitally coded form over a network, which would alleviate the need
for speech recognition capabilities for this function. An on-line connection could also be used in
order to obtain information concerning television scheduling.

A further method for obtaining accurate time information is to access a video signal
which contains the desired time information. For example, many cable broadcasting systems
have a channel which continuously broadcasts the time in image form. The interface tunes to this
channel, and acquires a representation of the screen image, thereafter performing a character
recognition algorithm to capture the time information. This character recognition algorithm
could also be used to obtain or capture information regarding programming schedules, stock
prices, and other text information which may appear on certain cable broadcast channels.

Thus, the interface, in obtaining necessary information, employs such available data
source access methods as speech recognition, character recognition, digital telecommunication
means, radio wave reception and interpretation, and links to other devices.

In interacting with the apparatus, the user first identifies himself/herself to the machine,
which can occur in a number of ways. This step may be dispensed with, or at least trivialized, if
only one user regularly interacts with the apparatus. Otherwise, such identification may be
important in order to maintain the integrity of the user profiles and predictive aspects of the
interface. A radio frequency transponder (RF-ID) or infrared transponder (IR-ID) system may
automatically determine the user based on a device, which may be concealed in a piece of
jewelry or wristwatch. The user may also be identified by voice pattern recognition, speaker
independent voice recognition, video pattern recognition, fingerprint, retinal scan, or other
biometric evaluation. An explicit entry of the user identity may also be employed, wherein the
user types his/her name on a keyboard or selects the name or unique identifier from a "pick-list".
The interface, upon identifying the user, retrieves information regarding the user, which may
include past history of use, user preferences, user sophistication, and patterns of variation of the
user, which may be based on, e.g., time, mood, weather, lighting, biometric factor or other factors.

Thus, after completing system diagnostics, including the time-check function referred to
above, the system next determines or predicts the desired function of the user. In this regard, if
more than one user has access to the system, the user identifies himself to the interface, in a user
identification step 1701 or an analogous action, which may be a coded entry, or a selection from
the menu. If the interface has voice recognition capability, then the user may be recognized by
his voice pattern, or merely by stating his name. The interface then accesses the memory for a
profile of the past use of the machine by the user, which may include the entire prior history,
relevant abstracts of the history, or derived user preferences, as shown in the personalized startup
based on user profile step 1702, which information is also stored and used in the past user history
determining element 2107. These choices differ in the amount of storage necessary in order to
retain the desired information.

Thus, if the user has only used the VCR to record, e.g., the National Broadcasting
Company (NBC) 11 o'clock news, i.e., record all days from 11:00 p.m. to 11:30 p.m. on NBC, in
the past, the most likely current predicted choice would be the NBC 11 o'clock news. If the
interface were to present a number of choices, having lower probability, then it interprets the
recording history to be "news" based on a database of broadcast information. Therefore, a
prediction of lower probability would be American Broadcasting Company (ABC) or Columbia
Broadcasting System (CBS) news at, e.g., 11:00 p.m., and the NBC news at, e.g., 5:00 p.m. In
a cable television system, there may be a number of NBC affiliated news alternatives, so that
these alternatives may be investigated first before other networks or the like are presented as
likely choices. In addition, where a video feed is unavailable, a text feed from the Internet or an
on-line service may be acquired as a probable alternative.

Thus, a number of likely choices, based on intelligently determined alternatives, as well 
as adaptation based on determined user preferences, are initially presented to the user, along 
with a menu selection to allow rejection of these predicted choices. In this case, the user selects 
the "reject" selection, and the system presents the user with a next predicted desired menu choice. 
Since the user history, in this case, does not provide for another choice of particularly high 
probability, the user is prompted to explicitly choose the program sequence by day, time, 
channel, and duration. The user then enters the starting time for recording according to the 
methods described above. The interface then searches its databases regarding the user and 
broadcast listings to present a most likely choice given that parameter, as well as all available 
alternatives. In this case, the user history is of little help, and is not useful for making a 
prediction. In other cases, the system uses its intelligence to "fill in the blanks", which could, of 
course, be rejected by the user if these are inaccurate or inappropriate. The most likely choices 
are then those programs that begin at the selected time. If the user had input the channel or 
network, instead of starting time, then the presented choices would be the broadcast schedule of 
the channel, e.g. channel 5 or Fox, for the selected day. 
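
The narrowing of broadcast listings by whichever parameter the user has entered (starting time or channel) can be sketched as a simple filter. The `Listing` record, its field names, and the sample schedule are hypothetical and serve only to illustrate the lookup; an actual broadcast database would carry far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    title: str
    channel: str
    day: str
    start: str  # "HH:MM", 24-hour clock


def candidates(listings, day, start=None, channel=None):
    """Listings on the given day that match whichever of start/channel was supplied."""
    matches = [
        item
        for item in listings
        if item.day == day
        and (start is None or item.start == start)
        and (channel is None or item.channel == channel)
    ]
    # Present a channel's schedule in time order; sorted() keeps ties in input order.
    return sorted(matches, key=lambda item: item.start)


# Hypothetical schedule data.
schedule = [
    Listing("Married With Children", "Fox", "Sunday", "21:00"),
    Listing("News", "NBC", "Sunday", "23:00"),
    Listing("Movie", "NBC", "Sunday", "21:00"),
    Listing("News", "NBC", "Monday", "23:00"),
]
by_time = [item.title for item in candidates(schedule, "Sunday", start="21:00")]
by_channel = [item.title for item in candidates(schedule, "Sunday", channel="NBC")]
```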

The user then selects one of the available choices, which completes the programming 
sequence. If no database of broadcasts is available, then the user explicitly defines all parameters 
of the broadcast. When the programming is completed, the interface then updates its user
database and prompts the user to set the VCR to record, by, e.g., inserting a blank or recordable tape.

If the predicted desire of the user is of no help, or the user seeks to explicitly program the
system, a manual program entry system is available. Where there is no useful prediction of the
user, the interface may request a training session, which may be a general inquiry, or specifically
directed to immediately forthcoming broadcasts, or both.

In this case, after a failure to predict a desired program, the user then proceeds to 
explicitly program the VCR interface to record "Married with Children" on Fox at 9:00 p.m. on 
Sunday evening. If a database is available, it might also show that "Married with Children" is 
also syndicated in re-runs, and therefore various episodes may be available on other channels at 
other times. Thus, during the subsequent session, both the premiere showing and re-run of
"Married With Children" would be available predicted choices, along with the 11 o'clock News
on NBC.

The user having demonstrated a preference for "Married with Children", the interface 
then characterizes the program. This includes, for example, a characterization of the soundtrack, 
the background, foreground, actors and actresses present, credits, etc. The interface then 
attempts to correlate the features present in the reference selection with other available 
selections. This comparison may be with a preformed database, providing immediate results, or 
prospectively, after entry of the reference selection. Of course, a number of correlation functions
may proceed simultaneously, and various choices may be merged to form a compound reference
selection, with any ambiguity to be resolved later. Further, as various "episodes" of the
reference selection occur, the system appends and integrates the most recent occurrence with the 
stored reference information, thus updating the reference database. 

When an occurrence is identified, it is immediately buffered, until such time as the 
particular episode may be compared against previously stored episodes. If two identical 
broadcasts occur simultaneously, one may be selected, i.e., the one with the best reception. 
When the episode is identified, if it is new, the buffered broadcast information is permanently 
stored; if it is previously stored, the buffer is flushed and the occurrence is further ignored as a 
"hit". Since the apparatus is now not responding to a direct request, it may then perform various 
housekeeping functions, including updating databases of broadcasts and the like. This is 
because, although the apparatus is preferably highly trained upon manufacture, a large number of 
new broadcasts are always being created and presented, so that the apparatus must constantly 
maintain its "awareness" of data types and trends, as well as update its predicted preferences of 
the user(s). 
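
The buffering and duplicate-episode handling described above can be sketched as follows. The sketch assumes that each identified episode can be reduced to a comparable signature (here an opaque string); how such a signature is derived from the characterized features is not specified here, and the class and method names are hypothetical.

```python
class EpisodeLibrary:
    """Buffers an occurrence until it is identified, then stores it or discards it."""

    def __init__(self):
        self._stored = {}   # episode signature -> permanently stored recording
        self.buffer = None  # occurrence awaiting identification

    def on_occurrence(self, recording):
        """An occurrence of the program was detected: buffer it immediately."""
        self.buffer = recording

    def on_identified(self, signature):
        """The buffered episode has been identified: keep it only if it is new."""
        recording, self.buffer = self.buffer, None  # the buffer is emptied either way
        if signature in self._stored:
            return "duplicate"  # already in the library: buffered copy is discarded
        self._stored[signature] = recording
        return "stored"

    def __len__(self):
        return len(self._stored)


# Hypothetical sequence: a new episode followed by a re-run of the same episode.
episodes = EpisodeLibrary()
episodes.on_occurrence("recording-1")
first = episodes.on_identified("episode-101")
episodes.on_occurrence("recording-2")
second = episodes.on_identified("episode-101")
```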
Based on input from the user, other programming including the same actors and/or
actresses may be processed, e.g., recorded. For example, Katey Sagal periodically appears on
"Jay Leno" as a musical guest, and therefore may be recorded in these appearances.



EXAMPLE 11

INTELLIGENT ADAPTIVE VCR INTERFACE 

Another example of the use of the present programming system allows a hybrid request
which does not correspond to any single broadcast schedule entry. In this case, if the user
instead wishes to record weather reports on all channels, the interface may be of further help.
The interface controls a plurality of tuner elements 2502 of a video signal reception device 2501,
so that a plurality of broadcasts may be simultaneously received. Using the mass storage and
possibly image data compression described above, a plurality of broadcasts may also be recorded
simultaneously in the intermediate storage 2503. The mass storage may be multiple VCRs,
optical storage, magnetooptical storage, or magnetic storage including disk (e.g., single disks,
multimedia compatible disks, RAID, etc.) and tape (QIC, 8mm, 4mm, etc.). Preferably, the archival
recording medium is recordable DVD or possibly recordable CD-ROM.

The optical recording tape produced by ICI, Inc., or other card or tape optical storage
medium might also be a useful storage medium for large volumes of data, as might be generated
by recording multiple video signals. The known implementations of the ICI product system are best
suited for commercial or industrial use and not for individual consumer use.

In any case, the interface 2506 accesses its associated database 2413 to determine, at a
given time, which channels are broadcasting "news". The interface system might also randomly
or systematically monitor or scan all or a portion of the available broadcasts for "special reports".
The interface system then monitors these channels for indicia of a "weather" information content
broadcast. For example, the newscaster who appears to report the weather on a given show is
usually the same, so that a pattern recognition system 2505 of the video frame could indicate the
presence of that newscaster. In addition, the satellite photographs, weather radar, computer
generated weather forecast screens, etc. are often similar for each broadcast. Finally, news
segments, such as "weather", often appear at the same relative time in the broadcast. Using this
information, the interface system selects certain broadcast segments for retention.

This retention begins at the beginning of a news segment, such as "weather", stops
during commercials, and continues after return from break, on all selected channels. In order to
assist in making accurate decisions, the monitored broadcasts may be stored in a temporary
storage medium until a decision is made, and thereafter transferred to a more
permanent storage medium if that be appropriate. It is noted that the system of the present
invention is intelligent, and may therefore "learn" either explicitly, or through training by
example. Therefore, if the system made an error during the process, the user may define the error
of the system, e.g., a substitute newscaster or rearrangement of news segments, so that the
interface system has a reduced likelihood of making the same error again. Thus, while such a
system is inherently complex, it offers significant user advantages. Further, while the interface
system itself is sophisticated, it provides simplicity, with inductive reasoning and deductive
reasoning for the user.
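
As a rough illustration of how the cues mentioned above (recognized newscaster, characteristic weather graphics, and relative position in the broadcast) might be combined into a retention decision, the following sketch uses a weighted sum. The cue names, the weights, and the threshold are assumptions chosen for the example and are not taken from the specification; a trained system would adjust them from user corrections.

```python
# Assumed relative importance of each cue; the weights sum to 1.0.
CUE_WEIGHTS = {"newscaster": 0.5, "graphics": 0.3, "time_slot": 0.2}


def weather_score(cues):
    """Weighted sum of cue confidences, each in [0, 1]; missing cues count as 0."""
    return sum(weight * cues.get(name, 0.0) for name, weight in CUE_WEIGHTS.items())


def retain(cues, threshold=0.6):
    """Decide whether a buffered segment should be kept as a 'weather' segment."""
    return weather_score(cues) >= threshold


# Hypothetical segments: one with the usual newscaster and radar graphics,
# one that merely falls in the usual time slot.
strong_segment = retain({"newscaster": 1.0, "graphics": 0.8})
weak_segment = retain({"time_slot": 1.0})
```

A substitute newscaster would lower the first cue, which is one reason the passage above allows the user to identify such errors to the system.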

Thus, a minimum of user interaction is required even for complex tasks, and nearly full
automation is possible, as long as the user and apparatus are able to communicate to convey a
preference. As a further embodiment according to the present invention, the interface system
will store transmitted data, and subsequently review that data, extracting pertinent information.
The stored data may then be deleted from the storage medium. In this regard, the system may be
self-learning.

It is noted that various algorithms and formulae for pattern recognition, correlation, data
compression, transforms, etc., are known to those skilled in the art, and are available in
compendiums, such as Netravali, Arun N., and Haskell, Barry G., "Digital Pictures:
Representation and Compression", Plenum Press, New York (1988); Baxes, Gregory A., "Digital
Signal Processing, A Practical Primer", Prentice-Hall, Englewood Cliffs, N.J. (1984); Gonzalez,
Rafael C., "Digital Image Processing", Addison-Wesley, Reading, MA (1987), and, of a more
general nature, Press, William H. et al., "Numerical Recipes in C: The Art of Scientific
Computing", Cambridge University Press, 1988.

EXAMPLE 12 

INTELLIGENT ADAPTIVE VCR INTERFACE

A further example of the use of the advanced intelligent features of the present invention
is the use of the system to record, e.g., "live" musical performances. These occur on many "talk"
shows, such as "Tonight Show" (NBC, 11:30 p.m. to 12:30 a.m., weeknights), "Saturday Night
Live" (NBC 11:30 p.m. to 1:00 a.m. Saturday-Sunday), and other shows or "specials" such as the
"Grammy Awards". The interface, if requested by the user to record such performances, then
seeks to determine their occurrence by, e.g., analyzing a broadcast schedule; interacting with the
on-line database 2411; and by reference to the local database 2413. When the interface
determines with high probability that a broadcast will occur, it then monitors the channel(s) at
the indicated time(s), through the plurality of tuners 2502. The system may also autonomously
scan broadcasts for unexpected occurrences.

In the case of pay-per-view systems and the like, which incorporate encrypted signals, an
encryption/decryption unit 2509 is provided for decrypting the transmitted signal for analysis and
viewing. This unit also preferably allows encryption of material in other modes of operation,
although known decryption systems without this feature may also be employed with the present
system. During the monitoring, the interface system acquires the audio and video information
being broadcast, through the signal receiver 2408, and correlates this information with a known
profile of a "live musical performance", in the preference and event correlator 2412. This must
be distinguished from music as a part of, e.g., a soundtrack, as well as "musicals" which are part
of movies and recorded operas, if these are not desired by the user. Further, music videos may
also be undesirable. When the correlation is high between the broadcast and a reference profile
of a "live musical performance", the system selects the broadcast for retention. In this case, the
information in the intermediate storage 2503 is transferred to the plant 2507, which includes a
permanent storage device 2508. The intermediate storage 2503 medium is used to record a
"buffer" segment, so that none of the broadcast is lost while the system determines the nature of
the broadcast. This, of course, allows an extended period for the determination of the type of
broadcast, so that, while real-time recognition is preferred, it is not absolutely necessary in order
to gain the advantages of the present invention. The buffer storage data, if not deleted, also
allows a user to select a portion for retention that the interface system has rejected.

Thus, while it is preferable to make a determination in real time, or at least maintain real
time throughput with a processing latency, it is possible to make an ex post facto determination
of the nature of the broadcast program. By using an available delay, e.g., about 5 to about 300
seconds, or longer, the reliability of the determination can be greatly increased as compared to an
analysis of a few frames of video data, e.g., about 15 to about 300 ms. An intermediate
reliability will be obtained with a delay of between about 300 and about 5000 ms. As stated
above, the storage system for this determination need not be uncompressed nor lossless, so long
as features necessary to determine the character of the broadcast are present. However, it is
preferred that for broadcast recording intended for later viewing, the storage be as accurate as
possible, so that if a compression algorithm is implemented, it be as lossless as reasonable given
the various constraints. The MPEG-2 standard would be applicable for this purpose, though
other video compression systems are available.
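
The trade-off between delay and reliability can be illustrated by deferring the decision until a window of per-frame scores has accumulated and then deciding on their average, so that a few atypical frames do not determine the outcome. The class name, window size, score values, and threshold below are hypothetical, and averaging is only one of many ways the buffered material could be evaluated.

```python
from collections import deque


class DelayedClassifier:
    """Withholds a decision until a full window of per-frame scores is available."""

    def __init__(self, window, threshold=0.5):
        self.scores = deque(maxlen=window)  # oldest scores drop out automatically
        self.threshold = threshold

    def add_frame_score(self, score):
        """Record the correlation score (0 to 1) computed for one analyzed frame."""
        self.scores.append(score)

    def decision(self):
        """None until the window is full; afterwards True or False by mean score."""
        if len(self.scores) < self.scores.maxlen:
            return None
        return sum(self.scores) / len(self.scores) >= self.threshold


# Hypothetical scores: one poor frame among several good ones.
classifier = DelayedClassifier(window=4)
classifier.add_frame_score(0.9)
classifier.add_frame_score(0.2)
early = classifier.decision()  # window not yet full
classifier.add_frame_score(0.8)
classifier.add_frame_score(0.7)
late = classifier.decision()
```

A longer window corresponds to a longer available delay and requires correspondingly more intermediate storage for the buffered broadcast.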

In a preferred situation, approximately 5 minutes of broadcast material is analyzed in 
order to make a determination of the content. This broadcast material is stored in two media. 
First, it is stored in a format acceptable for viewing, such as video tape in a videotape recorder, or 
in digital video format, e.g., uncompressed, MPEG-2. Second, it is received in parallel by the 

15 computer control, where the data is subject to a number of recognition and characterization 
processes. These are performed in parallel and in series, to produce a stored extracted feature 
matrix. This matrix may contain any type of information related to the broadcast material, 
including an uncompressed signal, a compressed signal, a highly processed signal relating to 
information contained in particular frames and abstract features, spatially and temporally 
dissociated from the broadcast signal, yet including features included in the broadcast which 
relate to the content of the broadcast. 

One possible method incorporates one or more digital signal processor based coprocessor 
elements, which may be present on, e.g., NuBus cards in the Macintosh Quadra 950 or Apple 
Power PC, on a PCI card in Pentium-based MS-DOS/Windows 3.1, 3.11, 95, 98, or NT computers (or 
Macintosh PCI-based computers), or in other Power PC based computers. These elements may be 
based on C-Cube CL550 (JPEG compression), Analog Devices ADSP-21020, Analog Devices 
ADSP-21060, AT&T (formerly American Telephone and Telegraph Co.) DSP32C, AT&T 
DSP3210, AMD 29000 series, Motorola DSP 96000ADS, Texas Instruments TMS 320C40, 
TMS 320C80, IBM Mwave, or other known devices. Other devices are also available from 
Analog Devices, AT&T, DSP Group, Motorola, NEC, SGS-Thomson, Sharp, Texas Instruments, 
Zilog, Zoran, and other vendors. See, EDN, May 11, 1995, pp. 40-106; Bursky, D., "Improved 
DSP ICs Eye New Horizons", Electronic Design, November 11, 1993, pp. 69-82. DSP systems, 
which generally have an architecture optimized for the efficient and rapid execution of repetitive 
numeric calculations, are desirable for certain pattern recognition tasks, and may be provided as a 
tightly coupled parallel processing array to increase throughput. 

A known board containing a DSP is the MacDSP3210 by Spectral Innovations Inc., 
containing an AT&T digital signal processor and an MC68020 CISC processor, and which uses 
the Apple Real-time Operating System Executive (A/ROSE) and Visible Cache Operating 
System (VCOS). It is preferred that the processors employed to process the video signals be 
optimized for image processing, because of their higher throughput in the present image 
processing applications, and that other signal processors be employed to analyze the audio 
signals. Of course, general purpose processors may be used to perform all calculations. An array 
processor which may be interfaced with a Macintosh is the Superserver-C available from Pacific 
Parallel Research Inc., incorporating parallel Inmos Transputers. Such an array processor may be 
suitable for parallel analysis of the image segment and classification of its attributes. 

Pattern recognition processing, especially after preprocessing of the data signal by digital 
signal processors and image compression engines, may also be assisted by logical inference 
engines, such as FUTURE (Fuzzy Information Processing Turbo Engine) by The Laboratory for 
International Fuzzy Engineering (LIFE), which incorporates multiple Fuzzy Set Processors 
(FSP), which are single-instruction, multiple data path (SIMD) processors. Using a fuzzy logic 
paradigm, the processing system may provide a best fit output to a set of inputs more efficiently 
than standard computational techniques, and since the presently desired result requires a "best 
guess", rather than a very accurate determination, the present interface is an appropriate 
application of this technology. 

As noted above, these processors may also serve other functions such as voice 
recognition for the interface, or extracting text from video transmissions and interpreting it. It is 
also noted that, while some of these coprocessing engines are now costly, these costs are 
decreasing and the present invention therefore includes the use of sophisticated present designs 
as well as future devices which may be used to perform the stated functions. The continued 
development of optical computers may also dramatically reduce the cost of implementing this 
aspect of the present invention; however, the present state of the art allows the basic functions to 
be performed. See attached appendix of references, incorporated herein by reference, detailing 
various optical computing designs. 

A real time operating system may be employed, of which there are a number of available 
examples. Some older examples include the SPOX DSP operating system, IBM's Mwave operating 
system and AT&T's VCOS operating system. These operating systems, and possibly others, are 
to be supported by Microsoft Corp.'s Windows 95 operating system Resource Manager function. 

It is noted that various methods are available for determining a relatedness of two sets of 
data, such as an image or a representation of an image. These include the determination of 
Hausdorff distance, fuzzy correlation, arithmetic correlation, mean square error, neural network 
"energy" minimization, covariance, cross correlation, and other known methods, which may be 
applied to the raw data or after a transformation process, such as an affine transformation, a 
Fourier transformation, a Gabor transformation, a warping transformation, a color map 
transformation, and the like. Further, it is emphasized that, in image or pattern recognition 
systems, there is no need that the entire image be correlated or even analyzed, nor that any 
correlation be based on the entirety of that image analyzed. Further, it is advantageous to allow 
redundancy, so that it is not necessary to have unique designations for the various aspects of the 
data to be recognized, nor the patterns to be identified as matching the uncharacterized input 
data. 
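
By way of non-limiting illustration, three of the relatedness measures named above (mean square error, cross correlation and Hausdorff distance) may be sketched as follows. The function names, the representation of image data as flat lists of samples, and the representation of features as sets of 2-D points are assumptions made for this sketch only and form no part of the disclosure.

```python
import math

def mean_square_error(a, b):
    """Mean square error between two equal-length sequences of samples."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross correlation, in the range [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A constant input has no variance; report "no correlation" in that case.
    return num / den if den else 0.0

def hausdorff_distance(p, q):
    """Symmetric Hausdorff distance between two non-empty 2-D point sets."""
    def directed(s, t):
        # Largest distance from a point of s to its nearest neighbor in t.
        return max(min(math.dist(u, v) for v in t) for u in s)
    return max(directed(p, q), directed(q, p))
```

Any of these may be applied to the raw data or to transformed data; a lower error or distance, or a higher correlation, indicates greater relatedness.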

The MSHELL from Applied Coherent Technology is a software system that runs on a 
Mercury MC3200 array processor, in conjunction with a Data Translation DT2861 or DT2862. 
The NDS1000 Development System from Nestor, Inc., provides image recognition software 
which runs on a PC compatible computer and a Data Translation DT2878. 

The C-Cube CL550 is disclosed in "C-Cube CL550 JPEG Image Compression 
Processor", Preliminary Data Book, August 1991, and addendum dated November 20, 1991, and 
products incorporating the CL550 include the JPEG Video Development Kit (ISA bus card with 
Chips and Technologies PC video 82C9001A Video Window Controller), and the C-Cube 
CL550 Development Board/PC for ISA Bus (CL550, for use with Truevision TARGA-16 or 
ATVista cards) or for NuBus (Macintosh). The so-called C-Cube "CL950" is an MPEG decoder 
device. Such a device as the CL950 may be particularly useful in the present VCR for 
reproducing compressed program material, which may be compressed by the present apparatus, 
or may be used for decompressing pre-compressed program material. Other MPEG-1 and 
MPEG-2 encoding and decoding devices are known. 

It is noted that all functions of a VCR would also be facilitated by the use of such 
powerful processors, and thus it is not only these advanced functions which are enabled by these 
advanced processors and coprocessors. It is also noted that these image recognition functions 
need not necessarily all be executed local to the user, and may in fact be centralized with 
resultant processed data transmitted to the remote user. This would be advantageous for two 
reasons: first, the user need not have an entire system of hardware localized in the VCR, and 
second, many of the operations which must be performed are common to a number of users, so 
that there is a net efficiency to be gained. 



EXAMPLE 13 

INTELLIGENT ADAPTIVE VCR INTERFACE 

The interface of the present invention incorporates an intelligent user interface level 
determination. This function analyzes the quality of the user input, rather than its content. Thus, 
this differs from the normal interface user level determination which requires an explicit entry of 
the desired user level, which is maintained throughout the interface until explicitly changed. The 
present interface may incorporate the "smart screen" feature discussed above, which may, 
through its analysis of the past user interaction with the interface, predict the most likely 
user input function. Thus, the predictive aspects of the present invention may be 
considered a related concept to the intelligent user level interface of the present invention. 
However, the following better serves to define this aspect of the invention. 

The input device, in addition to defining a desired command, also provides certain 
information about the user which has heretofore been generally ignored or intentionally removed. 
With respect to a two-dimensional input device, such as a mouse, trackball, joystick, etc., this 
information includes a velocity component, an efficiency of input, an accuracy of input, an 
interruption of input, and a high frequency component of input. This system is shown 
schematically in Fig. 21, which has a speed detector 2104, a path optimization detector 2105, a 
selection quality detector 2106, a current programming status 2108, an error counter 2109, a 
cancel counter 2110, a high frequency signal component detector 2112, an accuracy detector 
2113 and a physio-dynamic optimization detector 2114. In addition, Fig. 21 also shows that the 
interface also uses a past user history 2107, an explicit user level choice 2111 and an explicit 
help request 2115. 

This list is not exclusive, and is somewhat dependent on the characteristics of the specific 
input device. For a mouse, trackball, or other like device, the velocity or speed component refers 
to the speed of movement of the sensing element, i.e. the rotating ball. This may also be 
direction sensitive, i.e., velocity vector. It is inferred that, all other things being equal, the higher 
the velocity, the more likely that the user "knows" what he is doing. 

The efficiency of input refers to two aspects of the user interface. First, it refers to the 
selection of that choice which most simply leads to the desired selection. For 
example, if "noon" is an available choice along with direct entry of numbers, then the selection 
of "noon" instead of "12:00 p.m." would be more efficient. The second aspect of efficiency has 
to do with the path taken by the user in moving a graphic user interface cursor or input device 
from a current position to a desired position. For example, a random curve or wiggle between 
locations is less efficient than a straight line. This effect is limited, and must be analyzed in 
conjunction with the amount of time it takes to move from one location of a cursor on the screen 
to another; if the speed of movement is very rapid, i.e. less than about 400 mS for a full screen 
length movement, or less than about 300 mS for small movements, then an inefficiency in path is 
likely due to the momentum of the mouse and hand, momentum of the rolling ball, or a 
physiological arc of a joint. This aspect is detected by the physio-dynamic optimization detector 
2114. Thus, only if the movement is slow, deliberate, and inefficient, should this factor weigh 
heavily. It is noted that arcs of movement, as well as uncritical damping of movement around 
the terminal position may be more efficient, and a straight path actually inefficient, so that the 
interface may therefore calculate efficiency based on a complex determination, and act 
accordingly where indicated. 
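
By way of non-limiting illustration, the path efficiency analysis described above may be sketched as follows. The ratio of straight-line distance to distance travelled as the efficiency measure, the all-or-nothing weighting, and all function and parameter names are assumptions made for this sketch; only the approximate 400 mS and 300 mS limits are taken from the description above.

```python
import math

def path_efficiency(points):
    """Ratio of straight-line distance to distance travelled (1.0 = straight)."""
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0  # no movement: nothing to penalize
    return math.dist(points[0], points[-1]) / travelled

def efficiency_weight(duration_ms, full_screen, fast_full_ms=400, fast_small_ms=300):
    """Weight given to path inefficiency: 0.0 for rapid movements, else 1.0.

    Rapid movements are assumed to be bent by momentum or by the arc of a
    joint, so their inefficiency is discounted entirely in this sketch.
    """
    limit = fast_full_ms if full_screen else fast_small_ms
    return 0.0 if duration_ms < limit else 1.0

def inefficiency_score(points, duration_ms, full_screen):
    """Inefficiency in [0, 1], counted only for slow, deliberate movements."""
    return (1.0 - path_efficiency(points)) * efficiency_weight(duration_ms, full_screen)
```

A graded weight, or an efficiency model that favors natural arcs over straight lines, could be substituted for the simple threshold shown here.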

Thus, an "efficient" movement would indicate a user who may work at a high level, and 
conversely, an inefficient movement would indicate a user who should be presented with simpler 
choices. The efficiency of movement is distinguished from gestures and path dependent inputs, 
such as drawing and painting. These may be distinguished based on machine status or context. 
Further, the interface may recognize gestures in many contexts. Therefore, gesticulations must be 
distinguished from command inputs before further processing. Gesticulations, like path 
efficiency, may also be analyzed separately from the basic command input, and therefore may be 
provided as a separate input stream on an interface level rather than an application level, thus 
allowing cross application operation. 

Likewise, if a movement is abrupt or interrupted, yet follows an efficient path, this would 
indicate a probable need for a lower user interface level. This would be detected in a number of 
elements shown in Fig. 21, including the speed detector 2104, the high frequency signal component 
detector 2112, the accuracy detector 2113 and the physio-dynamic optimization detector 2114. In 
addition, Fig. 21 also shows the use of a past user history 2107, an explicit user level choice 2111 and an 
explicit help request 2115. 

While the interface may incorporate screen buttons which are smart, i.e. those which 
intelligently resolve ambiguous end locations, the accuracy of the endpoint is another factor in 
determining the probable level of the user. Thus, for example, if a 14" color monitor screen is 
used, having a resolution of 640 by 480 pixels, an accurate endpoint location would be within a 
central area of a screen button; for a button of size about 0.3" by about 1.0", this would be an 
area of about 0.25" by about 0.75". A cursor location outside this area, but inside the screen 
button confines, would indicate an average user, while a cursor location outside the screen button 
may be inferred to indicate the button, with an indication that the user is less experienced in 
using the pointing device. 
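
By way of non-limiting illustration, the endpoint accuracy determination may be sketched as follows, using the button dimensions of the example above (about 1.0" by about 0.3", with a central area of about 0.75" by about 0.25"). The function name, the category labels and the coordinate convention (inches, measured from the button center) are assumptions made for this sketch.

```python
def classify_endpoint(x, y, center, size=(1.0, 0.3), core=(0.75, 0.25)):
    """Classify a cursor endpoint relative to a screen button.

    Returns "accurate" inside the central area, "average" elsewhere inside
    the button, and "outside" beyond the button confines. All dimensions
    are (width, height) in inches.
    """
    dx = abs(x - center[0])
    dy = abs(y - center[1])
    if dx <= core[0] / 2 and dy <= core[1] / 2:
        return "accurate"
    if dx <= size[0] / 2 and dy <= size[1] / 2:
        return "average"
    return "outside"
```

An "outside" result near a button may still be resolved to that button by the smart screen feature, while being counted as evidence of a less experienced user.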

Finally, in addition to the efficiency of the path of the cursor pointing device, a high 
frequency component may be extracted from the pointer signal by the high frequency signal 
component detector 2112, which would indicate a physical infirmity of the user (tremor), a 
distraction in using the interface, indecision in use, or environmental disturbance such as 
vibration. In this case, the presence of a large amount of high frequency signal indicates that, at 
least, the cursor movement is likely to be inaccurate, and possibly that the user desires a lower 
user level. While this is ambiguous based on the high frequency signal content alone, in 
conjunction with the other indicia, it may be interpreted. If, for example, the jitter is due to 
environmental vibrations, and the user is actually a high level user, then the response of the user 
level adjust system would be to provide a screen display with a lowered required accuracy of 
cursor placement, without necessarily qualitatively reducing the implied user level of the 
presented choices. Thus, it would have an impact on the display simplification 2103, with only 
the necessary changes in the current user level 2101. 
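
By way of non-limiting illustration, one simple way to estimate the high frequency content of a pointer signal is to subtract a moving average (a crude low-pass filter) from the sampled positions and measure the root mean square of the residual. This particular filter, the window length and the function name are assumptions made for this sketch; the description above does not specify how the detector 2112 is implemented.

```python
import math

def high_frequency_rms(samples, window=5):
    """RMS of the residual after subtracting a centered moving average.

    `samples` is one coordinate of the pointer position, sampled at a
    uniform rate. A smooth sweep yields a value near zero; tremor or
    vibration superimposed on the sweep yields a larger value.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    residual = []
    for i in range(half, len(samples) - half):
        smooth = sum(samples[i - half:i + half + 1]) / window
        residual.append(samples[i] - smooth)
    if not residual:
        return 0.0  # too few samples to estimate anything
    return math.sqrt(sum(r * r for r in residual) / len(residual))
```

The resulting value would be interpreted together with the other indicia, since by itself it does not distinguish tremor from indecision or environmental vibration.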

Alternatively, the user may input a gesture, i.e., a stylized input having no other 
command input meaning, which may be detected by analyzing the input. The input may be a 
manual input, voice, image or the like. A number of different gestures may be recognized. 
These gestures are generally explicit inputs, which allow a voluntary action to be interpreted as 
input information to the interface. 

EXAMPLE 14 

INTELLIGENT TELEPHONE DEVICE INTERFACE 

Likewise, the present interface could be used to control complex telecommunications 
functions of advanced telephone and telecommunications equipment. In such a case, the user 
display interface would be a video display, or a flat panel display, such as an LCD display. The 
interface would hierarchically present the available choices to the user, based on a probability of 
selection by the user. The input device would be, for example, a small track ball near the 
keypad. Thus, simple telephone dialing would not be substantially impeded, while complex 
functions, such as call diversion, automated teledictation control, complex conferencing, caller 
identification-database interaction, and videotel systems, could easily be performed. 

EXAMPLE 16 

CHARACTER RECOGNITION OF VIDEO 

The present invention may incorporate character recognition from the video broadcast for 
automatic entry of this information. This is shown schematically in Fig. 24, with the inclusion of 
the videotext and character recognition module 2414. This information is shown to be 
transmitted to the event characterization unit 2407, where the detected information is correlated 
with the other available information. This information may also be returned to the control 2402. 
Examples of the types of information which would be recognized are titles of shows, cast and 
crew from programming material, broadcast special alerts, time (from digital display on special 
access channels), stock prices from "ticker tape" on special access channels, etc. Thus, this 
technology adds functionality to the interface. In addition, subtitled presentations could be 
recognized and presented through a voice synthesizer, to avoid the necessity of reading the 
subtitle. Further, foreign language subtitles could be translated into, e.g., English, and presented. 
In a particular embodiment, certain game shows, such as "Wheel of Fortune" have alphanumeric 
data presented as a part of the programming. This alphanumeric text may be extracted from the 
image. 

In a preferred embodiment, the character recognition is performed in known manner on a 
buffer memory containing a frame of video, from a device such as a Data Translation DT2851, 
DT2853, DT2855, DT2867, DT2861, DT2862 or DT2871. A contrast algorithm, run on, for 
example, a Data Translation DT2858, DT2868, or DT2878, first removes the background, 
leaving the characters. This works especially well where the characters are of a single color, e.g. 
white, so that all other colors are masked. After the "layer" containing the information to be 
recognized is masked, an algorithm similar to that used for optical character recognition (OCR) 
is employed. See, U.S. 5,262,860, incorporated herein by reference. These methods are well 
known in the art. This may be specially tuned to the resolution of the video device, e.g. NTSC, 
Super Video Home System (S-VHS), High Definition Television and/or Advanced Television 
Systems Committee (HDTV/ATSC-various included formats), Improved Definition Television 
(IDTV), Enhanced Definition Television (EDTV), Multiple Sub-Nyquist Sampling Encoding 
(MUSE), Phase Alternate Line (PAL), Séquentiel Couleur à Mémoire (SECAM), MPEG-2 digital 
video, or other analog or digital transmission and/or storage formats, etc. In addition, since the 
text normally lasts for a period in excess of one frame, a spatial-temporal image enhancement 
algorithm may be employed to improve the quality of the information to be recognized, if it is 
indistinct in a single frame. 
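
By way of non-limiting illustration, the masking of single-color (e.g. white) characters and a very simple temporal enhancement may be sketched as follows. Frames are assumed to be equally sized 2-D lists of grayscale values from 0 to 255; the plain averaging of co-located pixels, the threshold of 200 and the function names are assumptions made for this sketch, and the resulting mask would then be passed to a conventional OCR stage, which is not shown.

```python
def temporal_average(frames):
    """Average co-located pixels across a non-empty list of same-size frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    n = len(frames)
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def mask_bright(frame, threshold=200):
    """Binary mask: 1 where a pixel is at least `threshold`, else 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in frame]

# Three successive frames; the text pixels (left column) persist while a
# bright background pixel (bottom right of the first frame) does not.
frames = [
    [[255, 10], [250, 240]],
    [[255, 200], [245, 20]],
    [[250, 30], [255, 100]],
]
single_frame_mask = mask_bright(frames[0])
enhanced_mask = mask_bright(temporal_average(frames))
```

In this example the mask from the first frame alone retains the transient bright background pixel, whereas the mask from the averaged frames retains only the persistent text pixels.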

EXAMPLE 17 

SMART HOUSE INTERFACE 

The present invention may also be incorporated into other types of programmable 
controls, for example those necessary or otherwise used in the control of a smart house. See, 
"The Smart House: Human Factors in Home Automation", Human Factors in Practice, Dec. 
1990, 1-36. The user interface in such a system is very important, because it must present the 
relevant data to the user for programming the control to perform the desired function. A smart 
house would likely have many rarely used functions, so that both the data and the available 
program options must be presented in the simplest manner consistent with the goal of allowing 
the user to make the desired program choice. For example, a smart house system with 
appropriate sensors might be used to execute the program: "start dishwasher, if more than half 
full, at 9:00 p.m." This program might also include a program to load soap into the dishwasher 
or to check if soap is already loaded. A user who wishes to delay starting until 11:00 p.m. would 
be initially presented with the defaults, including start time as an option, which would be simply 
modified by correcting the starting time. The next time the same user wishes to program the 
device, an algorithm might change the predicted starting time to, e.g. 10:00 p.m., which is a 
compromise between the historical choices. Alternatively, the new predicted start time might be 
11:00 p.m., the last actually programmed sequence. Finally, the next predicted start time might 
remain at 9:00 p.m. The resolution of these choices would depend on a number of factors: a 
preprogrammed expert system; any other prior history of the user, even with respect to other 
appliances or in other situations; the context, meaning any other contemporaneously programmed 
sequences; and an explicit input from the user as to how the inputs should be evaluated for 
predictive purposes. 
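
By way of non-limiting illustration, the three alternative predictions described above (a compromise between historical choices, the last programmed value, or the unchanged default) may be sketched as follows. Times are expressed in minutes after midnight; the function name, the strategy labels and the use of a simple mean as the "compromise" are assumptions made for this sketch. Selection among the strategies would be made by the expert system, history, context and explicit input, which are not modeled here.

```python
def predict_start(default_min, history_min, strategy="compromise"):
    """Predict the next start time, in minutes after midnight.

    default_min: the preprogrammed default start time.
    history_min: start times the user actually programmed, oldest first.
    strategy:    "compromise" (mean of default and history),
                 "last" (most recent user choice), or
                 "default" (ignore history).
    """
    if strategy == "default" or not history_min:
        return default_min
    if strategy == "last":
        return history_min[-1]
    if strategy == "compromise":
        values = [default_min] + list(history_min)
        return round(sum(values) / len(values))
    raise ValueError(f"unknown strategy: {strategy}")

NINE_PM, ELEVEN_PM = 21 * 60, 23 * 60
compromise = predict_start(NINE_PM, [ELEVEN_PM], "compromise")
```

With a 9:00 p.m. default and a single 11:00 p.m. override, the compromise strategy yields 1320 minutes, i.e. 10:00 p.m., matching the example above.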

The expert system would balance factors, including disturbing noise from the dishwasher, 
which might be objectionable while persons are near the dishwasher, people are sleeping, or 
during formal entertainment. On the other hand, if the dishwasher is full, or its cleaned contents 
are needed, the dishwasher should run. Some persons prefer to reshelve dishes in the evening, 
before sleep, so in those cases, the dishwasher should complete its cycle before bedtime. The 
dishwasher, on a hot water cycle, should not run during showers or baths, and preferably should 
not compete with a clothes washer for hot water. The dishwasher preferably does not run during 
peak electrical demand times, especially if electrical rates are higher. Water conserving cycles 
should be selected, especially during droughts or water emergencies. If dishes remain in the 
dishwasher for an extended period, e.g., overnight, a moistening cycle may be employed to help 
loosen dirt and to help prevent drying. Thus, the expert system is preprogrammed for a number 
of high level considerations that might be common to a large number of users of the system, thus 
shortening the required training time of the system to learn the preferences of the user. Such a 
sophisticated system may eliminate the need entirely for adaptive responses, based on weighing 
of considerations provided by the user. Of course, other considerations may also be included for 
the operation or delay of operation of the dishwasher. Further, these considerations are 
exemplary of the types of considerations which might be employed in an expert system in a 
smart house. 

The prior history of the user provides an excellent source of information regarding the 
preferences of the user, although this is sometimes not the most efficient means, and may often 
include contradictory data. This historical use data is therefore analyzed in a broad context in 
order to extract trends, which over a number of uses may be further extracted as "rules". Often, 
the user history data will be applied at a high level, and will interact with preexisting rules of the 
expert system, rather than create new rules. In this case, the expert system preferably includes 
a large number of "extra rules", i.e., those with an a priori low probability or low weighting, 
providing a template for future pattern matching. The past history may be evaluated in a number 
of ways. First, an expert system may be used to analyze the past usage pattern. Second, a neural 
network may be trained using the historical data along with any corrective feedback. Third, the 
historical data may be used to alter fuzzy logic rules or classifications, either by expert system, 
neural network, or by other known means. 

The context of use may also be used to determine a desired or predicted action. 
Therefore, if on a single occasion, a number of changes are made, for example during a large 
house party, the standard predictions would not be altered, and thus a normal program would 
remain in effect. Of course, a new "house party" sequence would then be recognized and 
included as a new type of sequence for future evaluation. For example, a house party sequence 
might encompass a number of house systems. Thus, the delay of the dishwasher until 11:00 p.m. 
allows all dishes from the party to be placed in the dishwasher before starting. An alarm system 
would be generally deactivated, although various zones may be provided with different 
protection; e.g., a master suite may be off-limits, with an alarm transmitting a signal to a user's 
beeper, rather than a call to police or alarm service company. During the summer, the air 
conditioner might run even if doors and windows are open, even if the normal program prompts 
for door closings before the air conditioner is turned on. Likewise, exterior lighting would be 
turned on at dusk, with bug lights turned on during the entire party. The user might individually 
make such decisions, which would be recognized as a group due to their proximity in time, or 
delineate the actions as a group. Thereafter, where some of these choices are made, and the 
profile of choices matches a "party" style, the remainder of the choices may be presented as a 
most likely or predicted choice. The group of choices together might also be selected from a 
menu of choices. 
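
By way of non-limiting illustration, the matching of a partial set of choices against a stored "party" profile may be sketched as follows. The representation of a profile as a mapping from setting name to value, the setting names, and the rule of requiring at least two matching choices with no conflicting choice are all assumptions made for this sketch.

```python
def suggest_remaining(profile, choices, min_matches=2):
    """Suggest the rest of a stored profile when the choices so far match it.

    profile: mapping of setting name to value for a recognized sequence.
    choices: mapping of settings the user has programmed so far.
    Returns the profile settings not yet chosen, or an empty dict if too
    few choices match or any choice conflicts with the profile.
    """
    matches = sum(1 for k, v in choices.items() if profile.get(k) == v)
    conflicts = sum(1 for k, v in choices.items()
                    if k in profile and profile[k] != v)
    if matches < min_matches or conflicts:
        return {}
    return {k: v for k, v in profile.items() if k not in choices}

# Hypothetical stored profile built from an earlier house party.
party = {
    "dishwasher_start": "23:00",
    "alarm": "zones_only",
    "bug_lights": "on",
    "ac_with_open_doors": "on",
}
so_far = {"dishwasher_start": "23:00", "bug_lights": "on"}
suggested = suggest_remaining(party, so_far)
```

The suggested settings would be presented as most likely or predicted choices, which the user remains free to accept or alter.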

Context also relates to sensor data, which might include sensors in particular appliances 
or unrelated sensors. For example, infrared motion detectors may be used to estimate the number 
of persons present in a house. Likewise, heavy use of a bathroom, as detected by flushes, 
frequent light transitions or door openings, might also be useful as data to estimate a crowd size. 
Temperature sensors, video imaging sensors, perimeter sensors, electrical sensors relating to the 
status of appliances and machinery, and other types of sensors may provide data for context 
determination. 

Of course, explicit inputs must also be accommodated, which may be atomic instructions 
or complex combinations of instructions which may control a single house system or a number of 
house systems simultaneously. The explicit input preferably comes by way of the adaptive 
interface described throughout the present application, or an interface incorporating particular 
aspects thereof. 

The smart house system also controls the climate control system. Thus, it could 
coordinate temperatures, air flow and other factors, based on learned complex behaviors, such as 
individual movement within the dwelling. Since the goal of the programming of the smart house 
is not based on the storage of discrete information, but rather the execution of control sequences 
at various times and under certain circumstances, the control would differ in various ways from 
that of a VCR. However, the user interface system, adaptive user level, help system, and the like 
would be common to both types of system. This differs from the Fuzzy Logic controlled air 
conditioner available (in Japan) from Mitsubishi in that these prior art devices do not have an 
intelligent interface of the present invention. It should also be noted that the control for the VCR 
could be the same control as that for the smart house, so that the common elements are not 
redundant. Therefore, by applying a single control to many tasks, a common user interface is 
used, and the cost is reduced. 

EXAMPLE 18 

PROGRAMMABLE ENVIRONMENTAL CONTROLLER 

The present Example relates to a programmable environmental controller application. In 
this case, a sensor or sensor array is arranged to detect a change in the environment which is 
related to a climatic condition, such as an open door. On the occurrence of the door opening, the 
system would apply a pattern recognition analysis to recognize this particular sensor pattern, i.e. 
a mass of air at a different temperature entering the environment from a single location, or a loss 
of climate controlled air to a single location. These sensor patterns must be distinguished from 
other events, such as the action of appliances, movement of individuals in the vicinity of the 
sensor, a shower and other such events. It is noted that in this instance, a neural network based 
adaptive controller may be more efficient than a standard fuzzy logic system, because the 
installation and design of such a system is custom, and therefore it would be difficult to program 
fuzzy set associations a priori. In this case, a learning system, such as a neural network, may be 
more efficient in operation and produce a better result than other adaptive methods. The training 
procedure may be fully automated, (with manual feedback provided where necessary to adjust 
the control parameters) so long as sufficient sensors are provided for controlling the system, and 
also that an initial presumption of the control strategy is workable during the training period. In 
the case of an HVAC system, the initial strategy incorporated is the prior art "bang-bang" 
controller, which operates as a simple thermostat, or multi-zone thermostat. As a better starting 
point, a fuzzy logic temperature controller may be modeled and employed. Other known 
strategies which are not often used in environmental control include the 
proportional-integral-derivative (PID) controller. 
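
By way of non-limiting illustration, the "bang-bang" thermostat and a PID controller mentioned above may be sketched as follows. The hysteresis band in the thermostat, the textbook discrete form of the PID law, and all names and gain values are assumptions made for this sketch; either could serve as the initial control strategy during the training period of a learning controller.

```python
def bang_bang(temp, setpoint, heating_on, hysteresis=0.5):
    """Simple thermostat: return True if the heater should be on.

    The heater switches on below (setpoint - hysteresis), off above
    (setpoint + hysteresis), and keeps its current state in between.
    """
    if temp < setpoint - hysteresis:
        return True
    if temp > setpoint + hysteresis:
        return False
    return heating_on

class PID:
    """Textbook discrete PID controller (no anti-windup, no output limits)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        """Return the control output for one time step of length dt."""
        error = setpoint - measured
        self.integral += error * dt
        # No derivative term on the first step, when no prior error exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

The bang-bang controller produces only an on or off command, whereas the PID controller produces a graded output suited to variable-capacity equipment.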

It is noted that the HVAC system may also be of a type which is inoperable with standard 
type controllers; for example, the system may be such as to produce temperature oscillations, or 
significant temperature or pressure gradients. In this case, the default control system must be 
provided to compensate the system, allowing more subtle corrections and adjustments to be 
made based on preferences. Thus, an expert system is provided, which is updated based on user 
input, and which receives context information, including sensor data and other inputs. Explicit 
user preferences and programming are also input, preferably with an interface in accordance with 
the present invention or incorporating aspects thereof. 

In this example, which may be described with reference to Fig. 23, sufficient sensors in a 
sensor array 2301 are provided, being light, temperature, humidity, pressure, air flow and 
possibly a sensor for determining an event proximate to the sensor, such as door opening. While 
a single sensor array 2301 provides input to the present control, a plurality of sensor arrays are 
preferably employed in complex installations, such as that described here. The sensors, with the 
possible exceptions of the flow sensor and event sensor, are housed in a single sensor head. 
Further, the temperature and pressure sensors may be combined in a single integrated circuit by 
known means. The light and temperature sensors are known to those skilled in the art, and need 
not be described herein. The pressure sensor may be a Sensym strain gage pressure transducer, a 
Motorola pressure transducer device, or the like, which are known in the art. Alternatively, other 
types of sensors may be used, for example a micromachined silicon force balance pressure 
transducer, similar in electrical design to the Analog Devices monolithic accelerometers, 
ADXL-50 or ADXL-05. 

The humidity sensor is preferably an electronic type, producing an electrical signal 
output. It need not be internally compensated for the other measured environmental factors, as 
the constellation of sensors may compensate each other. The air flow sensor may be based on 
pressure differentials, using the electronic pressure sensor described above, or may be a 
mechanical vane type, which is based on flows. In most applications, a single flow axis will be 
sufficient; however, in some circumstances, a two or greater axis sensor will be required. 
Further, in the case of large volume areas, complex turbulent flow patterns may be relevant, for 
which known sensors exist. Laser based air flow sensors may be employed, if desired. LIDAR 
sensors may be used to determine flow rate, direction, and turbulence. 

The event sensor may be of any type, and depends particularly on the event being 
measured. In the present case, where a door opening is to be detected, it is preferred that the 
environmental control be interfaced with a perimeter intrusion alarm system, which, for example, 
provides a magnet embedded in the door and a magnetic reed switch in the door frame. 
Individual sensors are normally wired to the alarm control panel, thus providing central access to 
many or all of the desired event detection sensors while minimizing the added cost. The event 
detector may also be an ultrasonic, infrared, microwave-Doppler, mechanical, or other type of 
sensor. Wireless sensors may also be used, communicating via infrared beams, acoustic, radio 
frequency, e.g., 46-49 MHz, 900 MHz, or other bands, using analog, digital or multilevel 
quantized digital AM, FM, PSK, QAM, or other modulation scheme, or a combination thereof. 
Spread spectrum devices may be employed, as well as time, code or frequency multiplexing or a 
combination thereof. Various failsafe mechanisms are preferably included, including those 
identifying transmitter or receiver failure, communication interference or message collision, and 
other conditions. A reverse communication channel may also be included, either symmetric in 
band, or asymmetric in band or out of band, for communication with the sensor or apparatus 
associated with the sensor, and as part of the failsafe system. A forward error correction protocol 
is preferably effected, which may detect errors and include error correcting codes for digital 
transmissions. Digital data may be encrypted, and the transmission modulation scheme may also 
include an encrypted sequence of frequency, phase, convolution, noise, or other modulation 
parameter. 

While wireless data transmission as described above may be used, the preferred method 
of receiving sensor information is through a serial digital or analog (e.g., 4-20 mA transmitter) 
data transmission which may be multiplexed and/or part of a local area network scheme, with 
minimal local processing of the sensor data by the microprocessor 2302 with the serial link 
2302a in the sensor head. Such serial digital protocols and physical transport layers include 
Echelon LON-works, BSR X-10, CEBUS, RS-232, RS-423, Apple ADB, Appletalk, Ethernet 
(10 Base T, 10 Base 2, 10 Base 5, 100 Base T, 100 Base VG), ATM, USB, IEEE-1394, Homerun 
(Intel/Tut), etc. This system allows the central control 2303 to incorporate the desired 
processing, e.g., by the pattern recognition system 2304, etc., while minimizing the installation 
expense. A simple microprocessor device 2302 in the sensor head interfaces the sensing 
elements, and may provide analog-to-digital conversion, or other conversion which may be 
necessary, of the sensor signal. In the case of a serial digital data transmission, the local 
microprocessor formats the sensor data, including a code indicating the sensor serial number and 
type, the sensor status (e.g., operative, defective, in need of maintenance or calibration, etc.), the 
sensor data, and an error correcting code. In the case that the data is transmitted on a local area 
network, the microprocessor also arbitrates for bus usage and the messaging protocol. 
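
The message format described above may be illustrated by a minimal sketch. The field widths, byte order, status codes and function names below are assumptions made for illustration only; the text does not specify them. Note also that the sketch appends a CRC-32 checksum, which only detects corruption, in place of the error correcting code named in the text.

```python
import struct
import zlib

# Hypothetical status codes; the text lists operative, defective and
# in need of maintenance or calibration as example states.
STATUS_OPERATIVE = 0
STATUS_DEFECTIVE = 1
STATUS_NEEDS_MAINTENANCE = 2

# Assumed layout (big-endian, no padding): serial number (uint32),
# sensor type (uint8), status (uint8), reading (float32).
_BODY = struct.Struct(">IBBf")


def encode_frame(serial, sensor_type, status, reading):
    """Pack one sensor report and append a CRC-32 of the packed body."""
    body = _BODY.pack(serial, sensor_type, status, reading)
    return body + struct.pack(">I", zlib.crc32(body))


def decode_frame(frame):
    """Verify the checksum and return (serial, type, status, reading)."""
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    return _BODY.unpack(body)
```

A central control receiving such frames could discard any frame that fails the check and rely on the status field to decide whether a reading should be trusted.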

The control, it must be understood, has a number of available operative systems at its 
disposal, comprising the plant 2306. In this case, the system is a forced air heating and cooling 
system. This system has a heating unit, a humidifier, blowers, a cooling unit (which also 
dehumidifies), ducts, dampers, and possible control over various elements, such as automated 
door openers. 

As described above, the system is installed with a complete array of sensors, some of 
which may be shared with, or a part of, other control systems in the environment, and begins 
operation with a basic acceptable initial control protocol. The system then receives data from the 
sensors, and correlates data from the various sensors, including the event sensors, with the 
operation of the systems being controlled. In such a case, a "door open" event may be correlated 
with a change in other measured variables. The system then correlates the control status with the 
effect on the interrelation of the measured variables. Thus, the system would detect that if the 
blower is operating while the door is open, then there is a high correlation that air will flow out 
of the door, unless a blower operates to recirculate air from a return near the door. Thus, the 
system will learn to operate the proximate return device while the door is open and the blower is 
on. Once this correlation is defined, the system may further interrelate the variables, such as a 
wind speed and direction outside the door, effects of other events such as other open doors, the 
absolute and relative speeds of the blowers and the return device, the effect of various damper 
devices, etc. It is further noted that, under some circumstances, an exchange of air through an 
open door is desired, and in such instance, the system may operate to facilitate the flow through 
such an open door. Finally, the system must be able to "learn" that conditions may exist which 
produce similar sensor patterns which should be handled differently. An example is a broken, 
defective or inoperative sensor. In such a case, the system must be able to distinguish the type of 
condition, and not execute an aggressive control algorithm in an attempt to compensate for an 
erroneous reading or otherwise normal event. This requires the intelligent control of the present 
invention. In order to distinguish various events, sensors which provide overlapping or 
redundant information, as well as providing a full contextual overview, should be provided as a 
part of the system. 
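
The correlation of an event with a measured variable, and the use of redundant sensors to identify an erroneous reading, may be sketched as follows. This is a simplified illustration, not the learning method of the invention: a Pearson coefficient stands in for the correlation step, a comparison against the median of overlapping sensors stands in for fault discrimination, and the function names and tolerance are hypothetical.

```python
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def is_suspect(reading, redundant_readings, tolerance):
    """Flag a reading that disagrees with the median of overlapping sensors."""
    return abs(reading - median(redundant_readings)) > tolerance


# Door state (1 = open) sampled alongside a measured outflow near the door.
door_open = [0, 0, 1, 1, 0, 1]
outflow = [0.1, 0.0, 0.9, 1.1, 0.1, 1.0]
strongly_related = pearson(door_open, outflow) > 0.9
```

A reading flagged by `is_suspect` would be a candidate for exclusion before any compensating control action is taken.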

It is further noted that energy efficiency is a critical issue in climate control systems, and 
an absolute and continuous control over the internal environment may be very inefficient. Thus, 
the starting of large electrical motors may cause a large power draw, and simultaneous starting of 
such equipment may increase the peak power draw of a facility, causing a possible increase in the 
utility rates. Further, some facilities may operate on emergency or private power generation 
(co-generation) which may have different characteristics and efficiency criteria. These factors may 
all be considered in the intelligent control. It is also noted that a higher efficiency may also be 
achieved, in certain circumstances, by employing auxiliary elements of the climate control 
system which have a lower capacity and lower operating costs than the main elements. Thus, for 
example, if one side of a building is heated by the sun, it may be more efficient to employ an 
auxiliary device which suitably affects, i.e. compensates, only a part of the building. If such 
equipment is installed, the aggregate efficiency of the system may be improved, even if the 
individual efficiency of an element is lower. Likewise, it may be preferable to run a 2 ton air 
conditioning unit continuously, rather than a 5 ton air conditioning unit intermittently. The 
present intelligent control allows a fine degree of control, making use of all available control 
elements, in an adaptive and intelligent manner. 
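
The effect of simultaneous motor starts on peak power draw may be illustrated with a short calculation. The model below is an assumption made for illustration: each motor draws a fixed inrush power for a fixed number of seconds after starting and a lower running power thereafter, and all figures are hypothetical.

```python
def peak_draw(start_times, inrush_kw, running_kw, inrush_s, horizon_s):
    """Peak total power (kW) over the horizon, sampled once per second."""
    peak = 0.0
    for t in range(horizon_s):
        total = 0.0
        for start in start_times:
            if t < start:
                continue  # this motor has not started yet
            total += inrush_kw if t - start < inrush_s else running_kw
        peak = max(peak, total)
    return peak


# Three identical motors: 30 kW inrush for 3 s, then 5 kW running.
simultaneous = peak_draw([0, 0, 0], 30.0, 5.0, 3, 20)
staggered = peak_draw([0, 5, 10], 30.0, 5.0, 3, 20)
```

Under these assumed figures the staggered schedule peaks at 40 kW against 90 kW for the simultaneous start, which is the kind of difference a control considering peak draw could exploit.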

Returning to the situation of a door opening event, the system would take appropriate 
action, including: interrupting normal climate control until after the disturbance has subsided 
and normal conditions are achieved; beginning, based on the actual or predicted climatic 
conditions, a climate compensation control designed to maximize efficiency, maintain climatic 
conditions during the disturbance, and return to normal after the disturbance; and optionally, 
during the door opening disturbance, controlling a pressure or flow of air to counterbalance a 
flow through the door, by using a fan, blower or other device, or halting such a device, if 
necessary. It is also noted that the climatic control system 
could also be outfitted with actuators for opening and closing doors and windows, or an interface 
with such other system, so that it could take direct action to correct the disturbance, e.g., by 
closing the door. The internal and external ambients may differ in 
temperature, humidity, pollutants, or the like, and appropriate sensors may be employed. 

It is thus realized that the concepts of using all available resources to control an event, as 
well as using a predictive algorithm in order to determine a best course of action and a desired 
correction, are a part of the present invention. 

EXAMPLE 19 
REMOTE CONTROL HARDWARE 

A remote control of the present invention may be constructed from, for example, a 
Micromint (Vernon, CT) RTC-LCD, RTC-V25 or RTC-HC11 or RTC180 or RTC31/52, and 
RTC-SIR, in conjunction with an infrared transmitter and receiver, input keys and a compatible 
trackball, which may provide raw encoder signals, or may employ a serial encoder and have a 
serial interface to the processor module. A power supply, such as a battery, is used. The use, 
interfacing and programming of such devices is known to those skilled in the art, and such 
information is generally available from the manufacturer of the boards and the individual circuit 
elements of the boards. The function of such a remote control is to receive inputs from the 
trackball and keys and to transmit an infrared signal to the controller. 

The processor and display, if present, may provide added functionality by providing a 
local screen, which would be useful for programming feedback and remote control status, as well 
as compressing the data stream from the trackball into a more efficient form. In this case, certain 
of the extracted information may be relevant to the determination of the user level, so that 
information related to the user level would be analyzed and transmitted separately to the 
controller by the infrared transmitter. If the local LCD screen is used in the programming 
process, then the main controller would transmit relevant information to the remote display, by a 
reverse-channel infrared link. These components are known in the art, and many other types may 
also be used in known manner. 

In known manner, personal digital assistants ("PDAs"), available from 3Com 
(Palm Pilot III), Microsoft Windows CE-based devices, Apple ("Newton" model 100, 110, 120), 
Tandy, Poquet, Sharp, Casio, AT&T (Eo 440), Hewlett-Packard, etc. may also be employed as a 
human interface device. 

EXAMPLE 20 
MEDICAL DEVICE INTERFACE 

The interface and intelligent control of the present invention are applicable to control 
applications in medicine or surgery. This system may also be described with reference to the 
generic system drawings of Figs. 23 and 24. In this case, an operator identifies himself and 
enters information regarding the patient, through the interface 2305. The interface 2305 
automatically loads the profile 2406 of both the operator and the patient, if the device is used for 
more than one at a time, and is connected to a database containing such information, such as a 
hospital central records bureau. The interface may be connected to various sensors, of the input 
device 2401, such as ambient conditions (temperature, humidity, etc.), as well as data from the 
patient, such as electrocardiogram (EKG or ECG), electromyograph (EMG), 
electroencephalogram (EEG), Evoked Potentials, respirator, anesthesia, temperature, catheter 
status, arterial blood gas monitor, transcutaneous blood gas monitor, urinary output, intravenous 
(IV), intraperitoneal (IP), Intramuscular (IM), subcutaneous (SC), intragastric or other types of 
solutions, pharmaceutical and chemotherapy administration data, mental status, movement, 
pacemaker, etc. as well as sensors and data sources separate from the patient such as lab results, 
radiology and medical scanner data, radiotherapy data and renal status, etc. Based on the 
available information, the interface 2405, using the simple input device and the display screen 
described above, presents the most important information to the operator, along with a most 
probable course of action. The user then may either review more parameters, investigate further 
treatment options, input new data, or accept the presented option(s). The system described has a 
large memory in the signal analysis [...]ng medical devices 
and for applying artificial intelligence to assist in diagnosis, the present system allows for 
individualization based on both the service provider and the patient. Further, the present 
invention provides the improved interface for interaction with the system. 

It is further noted that, analogously to the library function discussed above, medical 
events may be characterized in the characterization unit 2407 and recorded by the plant 2404, so 
that a recording of the data need not be reviewed in its entirety in order to locate a particular 
significant event, and the nature of this event need not be determined in advance. It is also noted 
that the compression feature of the recorder of the present invention could be advantageously 
employed with the large volume of medical data that is often generated. Medical image data 
may be compressed as known in the art, by standard image compression techniques, and/or 
image compression techniques optimized for radiology, nuclear medicine and ultrasonography 
data. Other types of data may be compressed using lossless algorithms, or by various vector 
quantization, linear excited models, or fractal compression methods. It is finally noted that, 
because of its ability to store and correlate various types of medical data in the characterization 
unit 2407, the system could be used by the operator to create notes and discharge summaries for 
patients, using the database stored in the local database 2413, as well as the user history and 
preferences 2406. Thus, in addition to saving time and effort during the use of the device, it 
would also perform an additional function, that of synthesizing the data, based on medical 
significance. 

In addition to providing the aforementioned intelligence and ease of use, the present 
example also comprises a control 2402, and may interface with any of the sensors and devices, 
performing standard control and alarm functions. However, because the present control 2402 is 
intelligent and has pattern recognition capability, in addition to full data integration from all 
available data sources, it may execute advanced control functions. For example, if the present 
control 2402 is interfaced to a controlled infusion pump for, e.g., morphine solution, in e.g., a 
terminally ill patient, then certain parameters must be maintained, while others may be flexible. 
For example, a maximum flow rate is established as a matter of practice as a safety measure; too 
high a flow rate could result in patient death. However, a patient may not need a continuous 
infusion of a constant dose of narcotic. Further, as the patient's status changes, the level of 
infusion may be advantageously altered. In particular, if the renal status of the patient were to 
change, the excretion of the drug may be impaired. Therefore, by providing the controller with a 
urinary output monitor, it could immediately suppress the morphine infusion as soon as the renal 
output is recognized as being decreased, and further indicate an alarm condition. Further, it may 
be advantageous to provide a diurnal variation in the infusion rate, to provide a "sleep" period 
and a period of heightened consciousness with correspondingly lower levels of narcosis. Where 
various tests, procedures or interviews are scheduled, an appropriate level of narcosis and/or 
analgesia may also be anticipatorily provided at an appropriate time. 
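
The constraints described in this paragraph (a fixed maximum flow rate, suppression of the infusion with an alarm when urinary output falls, and a diurnal variation in rate) may be sketched as a simple rule. This is a schematic illustration of the control logic only; the function name, the night-time window and every numeric value are hypothetical placeholders and are not clinical values.

```python
def infusion_command(base_rate, max_rate, hour, urine_output, urine_threshold,
                     night_factor=1.0, day_factor=1.0):
    """Return (rate, alarm) for one control cycle.

    Rates and outputs are in arbitrary units. The night window
    (22:00 to 06:00) is an assumption made for illustration.
    """
    # Decreased renal output: suppress the infusion and raise an alarm.
    if urine_output < urine_threshold:
        return 0.0, True
    is_night = hour >= 22 or hour < 6
    rate = base_rate * (night_factor if is_night else day_factor)
    # The maximum flow rate is never exceeded, whatever the schedule requests.
    return min(rate, max_rate), False
```

The ordering matters in this sketch: the renal check and the maximum-rate clamp are applied unconditionally, while the diurnal factors are the flexible parameters.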

As another example of the use of the present device as a medical controller, the control 
2402 could be interfaced with a cardiac catheter monitor, as a part of the signal receiver 2408. In 
such a case, normally, alarms are set based on outer ranges of each sensor measurement, and 
possibly a simple formula relating two sensor measurements, to provide a useful clinical index. 
However, by incorporating the advanced interface and pattern recognition function of the present 
invention, as well as its ability to interface with a variety of unrelated sensors, the present device, 
including the present control, may be more easily programmed to execute control and alarm 
functions, may provide a centralized source of patient information, including storage and 
retrieval, if diverse sources of such information are linked, and may execute advanced, adaptive 
control functions. The present control 2402 is equipped to recognize trends in the sensor data 
from the signal receiver 2408, which would allow earlier recognition and correction of various 
abnormal conditions, as well as recognizing improvements in conditions, which could allow a 
reduction in the treatment necessary. Further, by allowing a fine degree of control, parameters 
may be maintained within optimal limits for a greater percentage of the time. In addition, by 
monitoring various sensors, various false alarms may be avoided or reduced. In particular, false 
alarms may occur in prior art devices even when sensors do not indicate a dangerous condition, 
merely as a safety precaution when a particular parameter is out of a specified range. In such a 
case, if a cause of such abnormal condition may be identified, such as patient movement or the 
normal activities of the patient's caretakers, then such condition may be safely ignored, without 
indicating an alarm. Further, even if a sensor parameter does in and of itself indicate a dangerous 
condition, if a cause, other than a health risk, may be identified, then the alarm may be ignored, 
or at least signaled with a different level of priority. By providing an intelligent and active filter 
for false alarm events, the system may be designed to have a higher level of sensitivity and 
specificity to real health risks, and further to provide a finer level of control based on the sensor 
readings, with fewer false positive readings. 
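
The filtering of false alarms by context may be sketched as follows. The set of benign causes, the priority labels and the function name are hypothetical; in the system described, the identification of a cause would come from pattern recognition across multiple sensors rather than from a fixed lookup.

```python
# Hypothetical context events that explain an out-of-range reading
# without indicating a health risk.
BENIGN_CAUSES = {"patient_movement", "caretaker_activity"}


def classify_alarm(value, low, high, context_events):
    """Return 'none', 'low' or 'high' priority for one sensor reading."""
    if low <= value <= high:
        return "none"
    # Out of range, but a benign cause was identified in the context data:
    # signal with a lower priority instead of a full alarm.
    if BENIGN_CAUSES & set(context_events):
        return "low"
    return "high"
```

For example, `classify_alarm(150, 50, 120, {"patient_movement"})` yields a low-priority signal, while the same reading with no identified cause yields a high-priority alarm.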

EXAMPLE 21 

SECURITIES TRADING TERMINAL INTERFACE 

The present invention is also of use in automated securities, debt, variable yield and 
currency trading systems, where many complex functions are available, yet often a particular 
user under particular circumstances will use a small subset of the functionality available at a 
given time. Such a situation would benefit from the present interface, which provides adaptive 
user levels, prioritized screen information presentation, and pattern recognition and intelligent 
control. A securities trading system is disclosed in U.S. Patent 5,034,916, for a mouse driven 
Fast Contact Conversational Video System, incorporated herein by reference. The present 
system relates primarily to the user terminal, wherein the user must rapidly respond to external 
events, in order to be successful. In such a case, the advantages of the application of an interface 
according to the present invention are obvious, and need not be detailed herein. However, the 
pattern recognition functions of the present invention may be applied to correspond to the desired 
actions of the trader, unlike in prior intelligent trading systems, where the terminal is not 
individually and adaptively responsive to the particular user. Thus, the system exploits the 
particular strengths of the user, facilitating his actions, including: providing the desired 
background information and trading histories, in the sequence most preferred by the user; 
following the various securities to determine when a user would execute a particular transaction, 
and notifying the user that such a condition exists; monitoring the success of the user's strategy, 
and providing suggestions for optimization to achieve greater gains, lower risk, or other 
parameters which may be defined by the user. Such a system, rather than attempting to provide a 
"level playing field" to all users of like terminals, allows a user to use his own strategy, providing 
intelligent assistance. By enhancing the interface, a user becomes more productive with fewer 
errors and faster training. 

EXAMPLE 

FRACTAL THEORY PATTERN RECOGNITION 

Affine transforms are mathematical manipulations of data in two dimensions, wherein the 
manipulation comprises a rotation, scaling and a displacement for each of the two coordinates. 
Schroeder, M, Fractals, Chaos, Power Laws, W.H. Freeman & Co., New York (1991). Of 
course, Affine transforms of higher dimensionality may also be employed. In describing an 
image using Affine transforms, the degree of matching between an image and the mathematical 
description of that image may be related by a number of iterations, and the fewer the iterations, 
the less data used to describe the image. Of particular importance in the field of graphics is the 
speed of "convergence", i.e., that a relatively few iterations are necessary in order to describe an 
image with sufficient precision to be visually useful. Therefore, the Affine transform 
mathematical specifications may be far more compact than the raw image data, and these 
specifications compare favorably to other types of image compression, such as discrete cosine 
transformation (DCT) compression schemes, including JPEG, depending on a number of factors. 
Because the Affine transform may be used to produce a compact visual description of an 
image, among other reasons, the present invention applies this transform to a pattern matching 
system for analyzing image contents. 
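
A two-dimensional Affine transform of the kind described may be sketched as follows. For brevity the sketch uses a single rotation angle and a single scale factor for both coordinates, which is a simplification; the function and parameter names are illustrative.

```python
from math import cos, sin, radians


def affine(point, angle_deg, scale, dx, dy):
    """Rotate a 2-D point about the origin, scale it, then displace it."""
    x, y = point
    a = radians(angle_deg)
    new_x = scale * (x * cos(a) - y * sin(a)) + dx
    new_y = scale * (x * sin(a) + y * cos(a)) + dy
    return new_x, new_y


# Rotate (1, 0) by 90 degrees, halve it, and shift by (2, 3).
moved = affine((1.0, 0.0), 90.0, 0.5, 2.0, 3.0)
```

Four numbers (angle, scale and two displacements) fully specify this transform, which illustrates why a set of such specifications can be more compact than raw pixel data.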

Pattern recognition, in this case, may proceed on an image basis, to match similar images, 
or on an object basis, in which portions of images are matched. It is preferred that the pattern 
matching system be robust, i.e., tolerant of various alterations of an image, artifacts, interference 
and configurational changes, while specific enough to allow object differentiation. 

In the case of video images, therefore, it is preferred that various two-dimensional 
projections of three-dimensional objects, in various "poses", be classified the same. This 
therefore requires that, in analyzing a two-dimensional image, the object be extracted from a 
background image and separated from other objects. Further, degrees of freedom may be 
determined, such as through analysis of a sequence of frames to reveal relative motion or change 
of portions of the object with respect to other portions. Finally, the object in the image must be 
compared to three dimensional models, through various projections. 

In the case of two dimensional image analysis, the image should be analyzed according to 
robust starting criteria, so that the similarity of images may be determined by comparison of 
normalized Affine transformation coefficients. 

Fractal analysis, the study of self-similarity, and a superset of Affine transformation, 
allows a compact representation of an image or an object in an image, and due to its 
encompassing of various spatial relationships of object parts, allows normalized transforms to be 
compared. In other words, assuming that the object is extracted from a background scene, and 
various degrees of freedom are identified, an Affine transformation may be applied, which will 
yield a similar result for an image of the same object in a different "pose", i.e., with different 
exercise of its degrees of freedom. While in general, Affine transformations are described with 
respect to two-dimensional images, these may also be applied to three dimensional images. 
Thus, if a triangular polygon is rotated, scaled and displaced in a two dimensional image, a 
tetrahedron is rotated, scaled and displaced in a three dimensional system. Further, analogies 
may also be drawn to the time dimension (although geometric forms which are rotated, scaled 
and displaced over time are not given trivial names). Because, in a contractive Affine 
transformation (one in which the scaling factor of successive iterations is less than 1), continued 
iterations are less significant, objects described with varying level of detail may be compared. 
Even images which are not normalized may still be compared, because at every level of the 
transform, slight changes in rotation, scale and displacement are accounted for. 
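
The diminishing significance of later iterations in a contractive transformation can be shown numerically. The sketch below repeatedly applies one rotation, scaling and displacement with a scale factor below 1 and records how far the point moves at each step; the parameter values are arbitrary and the function name is illustrative.

```python
from math import cos, sin, radians, hypot


def iterate_contractive(point, angle_deg, scale, dx, dy, n):
    """Apply the same affine map n times; return the final point and step lengths."""
    a = radians(angle_deg)
    x, y = point
    steps = []
    for _ in range(n):
        new_x = scale * (x * cos(a) - y * sin(a)) + dx
        new_y = scale * (x * sin(a) + y * cos(a)) + dy
        steps.append(hypot(new_x - x, new_y - y))
        x, y = new_x, new_y
    return (x, y), steps


final_point, step_lengths = iterate_contractive((10.0, 0.0), 30.0, 0.5, 1.0, 0.0, 8)
```

With a scale factor of 0.5, each step is half the length of the one before it, so truncating the iteration early discards only fine detail. This is why descriptions carried to different depths remain comparable.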

According to the present invention, nonlinear self-similarity may also be used. Further, 
in objects having more than two dimensions, linear scaling other than rotation, scaling and 
displacement may be described. 

It is noted that many types of optical computers, especially those including holographic 
elements, employ transformations similar to Affine transformations. Therefore, techniques of the 
present invention may be implemented using optical computers or hybrid optical-electronic 
computers. 

Thus, according to the present invention, the fractal method employing Affine transforms 
may be used to recognize images. This method proceeds as follows. A plurality of templates are 
stored in a memory device, which represent the images to be recognized. These templates may 
be preprocessed, or processed in parallel with the remainder of the procedure, in a corresponding 
manner. Image data, which may be a high contrast line image, grayscale, or having a full color 
map, the grayscale being a unidimensional color map, is stored in the data processor, provided 
for performing the recognition function. 

The image is preprocessed to extract various objects from the background, and to separate 
objects. This preprocessing may be performed in standard manner. The method of U.S. Patent 
No. 5,136,659, incorporated herein by reference, may also be used. As a part of this 
preprocessing, a temporal analysis of the object through a series of image frames is performed to 
provide four dimensional data about the object, i.e., the two dimensions from the image, a third 
dimension imputed from differing perspective views of the object, and time. Certain objects may be 
immediately recognized or classified, without further processing. Further, certain objects, 
without full classification or identification, may be "ignored" or subjected to a lesser level of 
final processing. During the classification processing, various objects may be selected for 
different types of processing, for example, people, automobiles, buildings, plants, etc. 

After classification, and temporal analysis, an object for further processing is analyzed for 
degrees of freedom, i.e., joints of a person, moving parts of an object, etc. These degrees of 
freedom may then be corrected, e.g., the object itself altered, to change the image into a standard 
format, or the degree of freedom information processed with the object to allow mathematical 
normalization without actual change of the image. 

The information describing the object image is stored. A plurality of addressable 
domains are generated from the stored image data, each of the domains representing a portion of 
the image information. As noted above, the entire image need not be represented, and therefore 
various objects may be separately analyzed. Further, only those parts of the image or object necessary 
for the recognition need be analyzed. While it may be unknown which image components are 
unnecessary, sometimes this may be determined. 

From the stored image data, a plurality of addressable mapped ranges are created, 
corresponding to different subsets of the stored image data. Creating these addressable mapped 
ranges, which should be uniquely addressable, also entails the step of executing, for each of the 
mapped ranges, a corresponding procedure upon the one of the subsets of the stored image data 
which corresponds to the mapped ranges. Identifiers are then assigned to corresponding ones of 
the mapped ranges, each of the identifiers specifying, for the corresponding mapped range, a 
procedure and an address of the corresponding subset of the stored image data. 
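
The creation of mapped ranges and their identifiers may be sketched as follows. The specific choices here are assumptions for illustration: the image is a list of rows of pixel values, subsets are non-overlapping square blocks twice the side of a domain, and the two procedures (a 2:1 averaging shrink, with and without a 90 degree rotation) are examples only. Each identifier is the pair (procedure name, block address).

```python
def blocks(image, size):
    """Yield ((row, col), block) for non-overlapping size x size blocks."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield (r, c), [row[c:c + size] for row in image[r:r + size]]


def shrink(block):
    """Average each 2x2 cell, halving the block's side length."""
    n = len(block) // 2
    return [[(block[2 * i][2 * j] + block[2 * i][2 * j + 1]
              + block[2 * i + 1][2 * j] + block[2 * i + 1][2 * j + 1]) / 4
             for j in range(n)] for i in range(n)]


def rotate90(block):
    """Rotate a square block 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


# Hypothetical procedure table; names are illustrative.
PROCEDURES = {
    "shrink": shrink,
    "shrink_rot90": lambda b: rotate90(shrink(b)),
}


def build_mapped_ranges(image, domain_size):
    """Map each identifier (procedure, address) to its mapped range."""
    ranges = {}
    for address, subset in blocks(image, 2 * domain_size):
        for name, procedure in PROCEDURES.items():
            ranges[(name, address)] = procedure(subset)
    return ranges
```

Because every mapped range is keyed by its procedure and the address of its source subset, the key alone is enough to regenerate the range from the stored image data.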

To ensure comparability, the processing treatment of the template and the image data are 
analogous. Of course, template data may be stored in preprocessed form, so that the image data 
need only be processed according to the same rules. The domains are optionally each subjected 
to a transform, which may be a predetermined rotation, an inversion, a predetermined scaling, 
and a displacement. Because of the nature of these linear superposable transforms, the earliest 
iterations will include data about gross morphology, later iterations will include data about 
configuration, and latest iterations will include data about texture. 

In addition, nonlinear alterations, and frequency, Gabor or wavelet transform 
preprocessing may be applied. A warping or other kind of transform may also be applied. These 
types of transforms are generally not included in Affine transform analysis, yet judiciously 
applied, may produce more rapid convergence, greater data storage efficiency, computational 
advantages or pattern matching advantages. 

This transform is used to optimize the procedure, and also to conform the presentation of 
the image data with the template, or vice versa. Each of the domains need not be transformed the 
same way, and in fact it is the transform coefficients which are stored to describe the transformed 
object, so that differences in coefficients relate to differences in objects. 

For each of the domains or transformed domains, as may be the case, the one of the 
mapped ranges which most closely corresponds according to predetermined criteria, is selected. 
The image is then represented as a set of the identifiers of the selected mapped ranges. 

Finally, from the stored templates, a template is selected which most closely corresponds 
to the set of identifiers representing the image information. This matching process is optimized 
for the data type, which is a string of iterative transform coefficients, of a contractive transform. 
It is preferred that, for each domain, a most closely corresponding one of the mapped 
ranges be selected. By performing analogous operations on a template and an unrecognized 
object in an image, a correspondence between the two may be determined. Thus, libraries of 
template image portions may be provided, with associated transform information, which may 
increase the computational efficiency of the system. 
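
The selection of a template from the set of identifiers may be sketched as follows. The text does not specify the comparison metric, so the fraction of positions at which identifiers agree is used here purely as an assumed stand-in; the function names and the template library are hypothetical.

```python
def match_score(image_ids, template_ids):
    """Fraction of positions at which the two identifier lists agree."""
    pairs = list(zip(image_ids, template_ids))
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def best_template(image_ids, templates):
    """Return the name of the stored template with the highest score."""
    return max(templates, key=lambda name: match_score(image_ids, templates[name]))


# Identifiers are (procedure, address) pairs, one per domain.
image_ids = [("shrink", (0, 0)), ("shrink_rot90", (0, 4)), ("shrink", (4, 0))]
templates = {
    "chair": [("shrink", (0, 0)), ("shrink_rot90", (0, 4)), ("shrink", (4, 4))],
    "table": [("shrink_rot90", (4, 4)), ("shrink", (0, 4)), ("shrink_rot90", (4, 0))],
}
selected = best_template(image_ids, templates)
```

Storing templates in this preprocessed identifier form means only the unknown image needs to be encoded at recognition time.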

In selecting the most closely corresponding one of the mapped ranges, for each domain, 
the mapped range is selected which is the most similar, by a method which is appropriate, and 
may be, for example, selecting the minimum Hausdorff distance from the domain, selecting the 
highest cross-correlation with the domain, selecting the minimum mean square error with the 
domain, or selecting the highest fuzzy correlation with the domain, based on rules which may be 
predetermined. Neural network energy minimization may also yield the best fit, and other 
techniques may also be appropriate. 

In particular, the step of selecting the most closely corresponding one of mapped ranges 
according to the minimum modified Hausdorff distance includes the step of selecting, for each 
domain, the mapped range with the minimum modified Hausdorff distance calculated as 
D[db,mrb] + D[1 - db,1 - mrb], where D is a distance calculated between a pair of sets of data 
each representative of an image, db is a domain, mrb is a mapped range, 1 - db is the inverse of a 
domain, and 1 - mrb is an inverse of a mapped range. 
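
A minimal Python sketch of this modified distance is given below. It assumes binary images, takes D to be the classical symmetric Hausdorff distance between the coordinates of the set pixels, and takes the inverse 1 - db to be the pixelwise complement; the specification does not fix these details, so they are assumptions, as are all function names.

```python
from math import hypot
from typing import Dict, List, Set, Tuple

Image = List[List[int]]  # binary image, 1 = set pixel (assumed representation)
Point = Tuple[int, int]


def points(img: Image) -> Set[Point]:
    """Coordinates of the set pixels."""
    return {(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v}


def inverse(img: Image) -> Image:
    """Pixelwise complement, i.e. 1 - img."""
    return [[1 - v for v in row] for row in img]


def hausdorff(a: Set[Point], b: Set[Point]) -> float:
    """Symmetric Hausdorff distance; 0 if both sets are empty, inf if only one is."""
    if not a and not b:
        return 0.0
    if not a or not b:
        return float("inf")

    def directed(p: Set[Point], q: Set[Point]) -> float:
        return max(min(hypot(x[0] - y[0], x[1] - y[1]) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))


def modified_hausdorff(db: Image, mrb: Image) -> float:
    """D[db, mrb] + D[1 - db, 1 - mrb]."""
    return (hausdorff(points(db), points(mrb))
            + hausdorff(points(inverse(db)), points(inverse(mrb))))


def select_range(db: Image, mapped_ranges: Dict[str, Image]) -> str:
    """Identifier of the mapped range with the minimum modified distance."""
    return min(mapped_ranges, key=lambda k: modified_hausdorff(db, mapped_ranges[k]))
```

Adding the distance between the inverses penalizes a mismatch in the background as well as in the foreground, which a one-sided comparison of set pixels would miss.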

It is important that the selection criteria be tolerant to variations of the type seen in image 
data, e.g., video, so that like objects have similar transforms. Thus, the selection criteria are not 
particularly directed to optimal data compression, although the two criteria may coincide for 
some types of data. 

In the case where the digital image data consists of a plurality of pixels, each having one 
of a plurality of associated color map values, the method includes a matching of the color map, 
which as stated above, encompasses a simple grey scale, natural color representation, and other 
color types. In such a case, the method is modified to optionally transform the color map values 
of the pixels of each domain by a function including at least one scaling function, for each axis of 
the color map, each of which may be the same or different, and selected to maximize the 
correspondence between the domains and ranges to which they are to be matched. For each of 
the domains, the one of the mapped ranges having color map pixel values is selected which most 
closely corresponds to the color map pixel values of the domain according to predetermined 
criteria, wherein the step of representing the image color map information includes the substep of 
representing the image color map information as a set of values each including an identifier of 
the selected mapped range and the scaling functions. The correspondence method may be of any 
sort and, because of the added degree of complexity, may be a different method than that chosen 
for non-color images. The method of optimizing the correspondence may be minimizing the 
Hausdorff distance or other "relatedness" measurement between each domain and the selected 
range. The recognition method concludes by selecting a most closely corresponding stored 
template, based on the identifier of the color map mapped range and the scaling functions, which 
is the recognized image. 
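
By way of illustration only, the per-axis scaling and range selection may be sketched in Python as follows. The sketch assumes three color axes, a purely multiplicative scaling function per axis chosen by least squares, and squared error as the "relatedness" measurement in place of a Hausdorff distance; none of these choices is dictated by the specification, and the names are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Pixel = Tuple[float, float, float]  # one value per color map axis (assumed)


def axis_scale(domain_axis: Sequence[float], range_axis: Sequence[float]) -> float:
    """Least-squares scale s minimizing sum((s * d - r) ** 2) over one color axis."""
    energy = sum(d * d for d in domain_axis)
    if energy == 0:
        return 0.0
    return sum(d * r for d, r in zip(domain_axis, range_axis)) / energy


def match_color_domain(
    domain: Sequence[Pixel], mapped_ranges: Dict[str, Sequence[Pixel]]
) -> Tuple[str, Tuple[float, ...]]:
    """Return (identifier of the best mapped range, per-axis scaling factors)."""
    if not mapped_ranges:
        raise ValueError("at least one mapped range is required")
    best = None
    for ident, rng in mapped_ranges.items():
        scales = []
        error = 0.0
        for axis in range(3):
            d = [p[axis] for p in domain]
            r = [p[axis] for p in rng]
            s = axis_scale(d, r)
            scales.append(s)
            error += sum((s * x - y) ** 2 for x, y in zip(d, r))
        if best is None or error < best[0]:
            best = (error, ident, tuple(scales))
    return best[1], best[2]
```

The returned pair corresponds to the stored value described above: the identifier of the selected mapped range together with the scaling applied on each axis.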

Color information may have less relevance to pattern recognition than, for example, edge 
information, and therefore may be subjected to a lesser degree of analysis. The color information 
may also be analyzed separately, using a different technique. 

EXAMPLE 24 
IMAGE ANALYSIS 

Alternatively to the object extraction, the image as a whole may be analyzed. In the case 
of moving images, the aforementioned method is further modified to accommodate time varying 
images. These images usually vary by small amounts between frames, and this allows a 
statistical improvement of the recognition function by compensating for a movement vector, as 
well as any other transformation of the image. This also allows a minimization of the processing 
necessary because redundant information between successive frames is not subject to the full 
degree of processing. Of course, if the image is substantially changed, then the statistical 
processing ceases, and a new recognition function may be begun, "flushing" the system of the 
old values. The basic method is thus modified by storing delayed image data information, i.e., a 
subsequent frame of a moving image. This represents an image of a moving object differing in 
time from the image data in the data processor. 

A plurality of addressable further domains are generated from the stored delayed image 
data, each of the further domains representing a portion of the delayed image information, and 
corresponding to a domain. Thus, an analogous transform is conducted so that the further 
domains each correspond to a domain. A plurality of addressable mapped ranges 
corresponding to different subsets of the stored delayed image data are created from the stored 
delayed image data. The further domain and the domain are optionally matched by subjecting a 
further domain to a corresponding transform selected from the group consisting of a rotation, an 
inversion, a scaling, and a displacement, which corresponds to a transform applied to a 
corresponding domain, and a noncorresponding transform selected from the group consisting of a 
rotation, an inversion, a scaling, and a translation, which does not correspond to a transform 
applied to a corresponding domain. For each of the further domains or transformed further 
domains, the one of the mapped ranges is selected which most closely corresponds according to 
predetermined criteria. As stated above, these domains may also be subjected to corresponding 
and noncorresponding frequency domain processing transforms, Gabor transforms, and wavelet 
transforms. 

A motion vector is then computed between one of the domain and the further domain, or 
the set of identifiers representing the image information and the set of identifiers representing the 
delayed image information, and the motion vector is stored. The further domain is compensated 
with the motion vector and a difference between the compensated further domain and the domain 
is computed. For each of the delayed domains, the one of the mapped ranges is selected which 
most closely corresponds according to predetermined criteria. The difference between the 
compensated further domain and the domain is represented as a set of difference identifiers of the 
selected mapping ranges and an associated motion vector. 
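
The motion vector and compensated difference can be illustrated with the following Python sketch. It assumes an exhaustive block-matching search over small integer displacements with a sum-of-absolute-differences criterion, which is one common technique and not necessarily the one contemplated here; `shift`, `motion_vector` and `residual` are hypothetical names.

```python
from typing import List, Tuple

Frame = List[List[int]]


def shift(frame: Frame, dy: int, dx: int, fill: int = 0) -> Frame:
    """Displace a frame by (dy, dx), filling uncovered pixels with `fill`."""
    h, w = len(frame), len(frame[0])
    return [
        [frame[r - dy][c - dx] if 0 <= r - dy < h and 0 <= c - dx < w else fill
         for c in range(w)]
        for r in range(h)
    ]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two frames."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_vector(domain: Frame, further: Frame, radius: int = 2) -> Tuple[int, int]:
    """Displacement that, applied to the further domain, best aligns it with the domain."""
    candidates = [(dy, dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates, key=lambda v: sad(shift(further, v[0], v[1]), domain))


def residual(domain: Frame, further: Frame, vector: Tuple[int, int]) -> Frame:
    """Difference between the domain and the motion-compensated further domain."""
    compensated = shift(further, vector[0], vector[1])
    return [[d - c for d, c in zip(rd, rc)] for rd, rc in zip(domain, compensated)]
```

When the compensation is good, the residual is mostly zero, so only the motion vector and a small set of difference identifiers need be stored for the delayed frame.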

This method is described with respect to Figs. 27, 28 and 29. Fig. 27 is a basic flow 
diagram of the recognition system of the present invention. Fig. 28 provides a more detailed 
description, including substeps, which are included in the major steps shown in Fig. 27. 
Basically, the image, or a part thereof, is decomposed into a compressed coded version of the 
scene, by a modified fractal-based compression method. In particular, this differs from the prior 
compression algorithms in that only a part, preferably that part containing objects of interest, 
need be fully processed. Thus, if a background is known (identified) or uninteresting, it may be 
ignored. Further, the emphasis is on matching the available templates to produce an image 
recognition, not achieving a high degree of compression. Therefore, the image, or domains 
thereof, may be transformed as required in order to facilitate the matching of the templates. As 
with respect to single images, the templates are represented in analogous form, having been 
processed similarly, so that a comparison of the relatedness of an object in an image and the 
templates may be performed. In particular, if an oblique view of an object is presented, then 
either the object may be transformed to achieve a predicted front view, or the template 
transformed or specially selected to correspond to the oblique view. Further, once a recognition 
has taken place with a high degree of certainty, the system need only ensure that the scene has 
not changed, and need not continually fully process the data. This has implications where 
multiple recognition processes are occurring simultaneously, either in a single [...] not meet that 
required for de novo real time recognition of all aspects of all the objects or images. 

In order to limit processing of portions of images, exclusionary criteria may be applied 
which allow truncation of processing when it is determined that an option is precluded or there 
exists a significantly higher probability alternative. The processing system may use primarily 
exclusionary criteria to select the best predictions, or after preselection, employ a highest 
probability selection system on the remaining choices. 
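
One simple form of exclusionary criterion is early abandonment of a candidate once its accumulated error already exceeds that of the best alternative found so far. The Python sketch below illustrates this under the assumption of a squared-error measure; the function name and the bookkeeping of examined elements are hypothetical and serve only to show the truncation.

```python
from typing import Dict, Optional, Sequence, Tuple


def best_match_with_exclusion(
    sample: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> Tuple[Optional[str], Dict[str, int]]:
    """Return the best candidate and, per candidate, how many elements were examined."""
    best_id: Optional[str] = None
    best_err = float("inf")
    examined: Dict[str, int] = {}
    for ident, cand in candidates.items():
        err = 0.0
        count = 0
        for s, c in zip(sample, cand):
            err += (s - c) ** 2
            count += 1
            if err >= best_err:
                break  # candidate is precluded; truncate its processing
        examined[ident] = count
        if err < best_err:
            best_id, best_err = ident, err
    return best_id, examined
```

The examined counts show the saving: a candidate that is clearly worse is dropped after its first few elements rather than being fully compared.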

Fig. 30 shows a flow diagram of a cartoon-like representation of an image recognition 
method of the present invention. It shows, initially, an input image 3001, having a degree of 
complexity. A windowing function 3002 isolates the object from the background. A first order 
approximation of the image is generated 3003, here called a mapping region. The first order 
approximation is then subtracted from the initial image to produce a difference 3004. The first 
order error is then subjected, iteratively, to successive transform and subtract operations 3005 
and 3006, until the error is acceptably small, at which point the input image is characterized by a 
series of codes, representing the first order approximation and the successive transforms, which 
are stored 3008. These codes are then compared with stored templates 3009. The comparisons 
are then analyzed to determine which template produces the highest correlation 3010, and the 
match probability is maximized 3011. The recognized image is then indicated as an output 3012. 
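
The approximate, subtract and repeat loop of Fig. 30 can be sketched in Python as shown below. The sketch substitutes a greedy selection from a small fixed codebook (in the manner of matching pursuit) for the fractal transform of the figure, so it illustrates only the iterative structure; the codebook, tolerance and function name are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def encode_by_successive_approximation(
    signal: Sequence[float],
    codebook: Dict[str, Sequence[float]],
    tolerance: float = 1e-6,
    max_steps: int = 16,
) -> Tuple[List[Tuple[str, float]], List[float]]:
    """Return (codes, final residual); each code is (codebook identifier, gain)."""
    residual = list(signal)
    codes: List[Tuple[str, float]] = []
    for _ in range(max_steps):
        if sum(v * v for v in residual) <= tolerance:
            break  # the error is acceptably small
        best = None
        for ident, atom in codebook.items():
            energy = sum(a * a for a in atom)
            if energy == 0:
                continue
            gain = sum(r * a for r, a in zip(residual, atom)) / energy
            reduction = gain * gain * energy  # error removed by this approximation
            if best is None or reduction > best[0]:
                best = (reduction, ident, gain)
        if best is None or best[0] <= 0:
            break  # no codebook entry reduces the error further
        _, ident, gain = best
        residual = [r - gain * a for r, a in zip(residual, codebook[ident])]
        codes.append((ident, gain))
    return codes, residual
```

The resulting list of codes plays the role of the stored series of codes, which would then be compared against templates encoded in the same way.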

This system is shown in Fig. 26, wherein a sensor 2602 provides data, which may be 
image data, to a control 2601. The control 2601 serves to control the plant 2603, which has an 
actuator. The plant 2603 may be a VCR or the like. The control 2601 has associated with it an 
intermediate sensor data storage unit 2611, which may be, for example a frame buffer or the like. 
The control 2601 also has associated with it a transform engine 2612, which may perform a 
reversible or irreversible transform on the data or stored data. 

The system also has a template input 2610, which may receive data from the sensor 2602, 
if accompanied by identifying information. Thus, the pattern storage memory 2609 stores a 
pattern, such as an image pattern, along with an identifier. 

The control 2601 also has an input device 2604, an on-screen display interface 2605, and 
a program memory 2606, for inputting instructions from a user, providing feedback to the user, 
and recording the result of the user interaction, respectively. Finally, a characterization network 
2607 characterizes the sensor 2602 data, which may be provided directly from the sensor 2602 or 
preprocessing circuitry, or through the control 2601. A correlator 2608 correlates the output of 
the characterization network with the stored patterns, representing the templates from the 
template input 2610. The system therefore operates to recognize sensor patterns, based on the 
correlator 2608 output to the control 2601. 

When analyzing objects in a sequence of images, a determination is made of the 
complexity of the difference based on a density of representation. In other words, the error 
between the movement and transform compensated delayed image and the image is quantified, to 
determine if the compensation is valid, or whether the scene is significantly changed. When the 
difference has a complexity below a predetermined or adaptive threshold, a template is selected, 
from the stored templates, which most closely corresponds or correlates with both the set of 
identifiers of the image data and the set of identifiers of the delayed image data, thus improving 
recognition accuracy, by allowing a statistical correlation or other technique. The threshold may 
be set based on an error analysis of the system to determine statistical significance or using other 
criteria. The threshold may also be adaptively determined based on the history of use of the 
machine and feedback. For example, if the two images both have a high correlation with one 
template, while a first of the images has a slightly higher correlation with another template, while 
the second image has a much lower correlation with that other template, then the system would 
score the first template as a better match to the first image, based on this differentiation. Thus, 
templates may be particularly selected to best differentiate similar images of objects. 
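
The thresholded use of the delayed image can be illustrated with the Python sketch below. It assumes that the correlation of each image with each template is already available and that the two are combined by a simple average; the averaging rule, like the function name, is an assumption and only one of many possible statistical combinations.

```python
from typing import Dict


def select_template(
    corr_image: Dict[str, float],
    corr_delayed: Dict[str, float],
    difference_complexity: float,
    threshold: float,
) -> str:
    """Pick a template, using the delayed image only if the scene is unchanged."""
    if difference_complexity < threshold:
        # Compensation is valid: score each template against both images.
        scores = {t: (c + corr_delayed.get(t, 0.0)) / 2.0 for t, c in corr_image.items()}
    else:
        # Scene significantly changed: rely on the current image alone.
        scores = dict(corr_image)
    return max(scores, key=lambda t: scores[t])
```

With the figures used in the test of this sketch, the first image alone slightly favors a second template, but the much lower correlation of the delayed image with that template tips the selection back to the first.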

EXAMPLE 25 

PATTERN RECOGNITION SYSTEM 

The present system allows for the use of a pattern recognition subsystem for a controller 
which acts in accordance with a detected pattern. In image, audio and multimedia applications, 
different types of image processing may take place. First, various processing algorithms may 
take place in parallel, with an optimum result selected from the results of the various algorithms. 
Further, various processing schemes may be applied in sequence, with differing sequences 
applied to different data streams. These processing schemes may be commutative, i.e. yield 
approximately the same result regardless of the processing order, or may be highly order 
dependent, in which case a processed data stream must include information relating to the 
sequence of processing for interpretation. 
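
A short Python sketch illustrates why an order-dependent scheme must carry its processing sequence with the data. The three steps (`clip`, `double`, `negate`) are hypothetical stand-ins for real processing stages; the processed stream is returned together with the sequence that produced it.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Step = Callable[[List[float]], List[float]]

# Hypothetical processing steps, chosen only to show order dependence.
STEPS: Dict[str, Step] = {
    "clip": lambda xs: [min(x, 10.0) for x in xs],
    "double": lambda xs: [2.0 * x for x in xs],
    "negate": lambda xs: [-x for x in xs],
}


def process(data: Sequence[float], order: Sequence[str]) -> Tuple[Tuple[str, ...], List[float]]:
    """Apply the named steps in order and tag the result with the sequence used."""
    out = list(data)
    for name in order:
        out = STEPS[name](out)
    return tuple(order), out
```

Here `double` and `negate` commute, so either order yields the same stream, whereas `clip` and `double` do not, and the attached sequence tag is what allows the result to be interpreted.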

Various exemplars may reside in a fragment library, for comparison with unidentified 
data. In the case of processing path dependent systems, an exemplar may be found in multiple 
forms based on the processing procedure, or in a small subset of corresponding libraries. In 
general, both lossless compression methods and lossy compression methods employed using 
high fidelity parameters to minimize loss may be processed to produce a relatively or almost 
unique result for each unknown data set, while lossy compression or processing methods will be 
particularly procedure sensitive, especially if differing strategies are employed. These differing 
strategies may be used to emphasize different features of the unknown data set in order to 
facilitate comparison. This technique is especially useful when the processing procedures are 
run in parallel, so that the latency penalty for redundant processing is minimized. Techniques 
available for this processing include vectorization, fractal processing, iterated function systems, 
spatial frequency processing (DCT-JPEG, MPEG, etc.), wavelet processing, Gabor transforms, 
neural nets (static or sequence of images), and other known techniques. 

In a preferred embodiment, a spatial frequency or wavelet processing step is performed 
first, on static image data or a sequence of images, with a fractal domain processing step 
performed thereafter. This allows high frequency noise to be initially filtered, with subsequent 
fractal-based correlated noise detection and subtraction, therefore allowing cleanup without loss 
of high frequency detail. Preferably, before the fractal-based processing, which may be 
performed by a digital computer or optical processing apparatus, standard edge detection/object 
separation, e.g., high frequency filtering, contour mapping, artificial intelligence, etc. may be 
performed. A fractal transform is then performed on the image or a portion thereof, starting in a 
standardized manner, e.g. at a point of lowest complexity, or the epicenter of the largest feature 
for beginning a contractive transform. The processed image may then be matched with one or 
more databases to identify all or a portion of the image. Optionally, after a match has been found 
and/or confirmed by an operator, using the human interface system, the method is then optimized 
to minimize the errors and increase the efficiency of later matches. This may be performed by 
modifying the database record, or related records, as well as modifying the preprocessing 
algorithm. In a preferred embodiment, the image is processed piecemeal, on an object-by-object 
basis. Therefore, after an object has been processed, it is extracted from the image so that the 
remaining information may be processed. Of course, multiple objects may be processed in 
parallel. The exemplar database is preferably adaptive, so that new objects may be added as they 
are identified. 

The present technology may also be used with a model-based exemplar database, wherein 
an image object is matched, based on a two dimensional projection, or analysis of a sequence of 
images, with a multidimensional model of an object. For example, the model may include 
volume, as well as multiple degrees of freedom of movement. Further, objects may also include 
"morphing" characteristics, which identify expected changes in an appearance of an object. 
Other types of characteristics may be included in conjunction with the exemplar in the database. 

In a preferred embodiment, a model contained in a database includes a three or more 
dimensional representation of an object. These models include information processed by a 
fractal-based method to encode repetitive, transformed patterns in a plane, space, time, etc., as 
well as to include additional degrees of freedom, to compensate for changes in morphology of 
the object, to allow continuous object identification and tracking. Thus, once an object is 
identified, an expected change in that object will not necessitate a reidentification of the object. 
According to one embodiment, a fractal-like processing is executed by optical elements of an 
optical or optical hybrid computer. Further, in order to temporarily store an optical image, 
optically active biological molecules, such as bacteriorhodopsins, etc. may be used. Liquid 
crystals or other electrophotorefractive active materials may also be used. These images may be 
simple two dimensional images, holograms, or other optical storage methods. A preferred 
holographic storage method is a volume phase hologram, which will transform an impressed 
image, based on hologram to image correlation. Thus, these models would be somewhat linear 
transform independent, and would likely show some (planar) transform relationship. Thus, an 
optical computer may be advantageous because of its high computational speed as compared to 
digital computers for image analysis, due to inherent parallelism and high inherent speed. 

Because of the present limitations in speed of writing an image to optical recording 
media, especially holographic images, the preferred system includes a plurality of image storage 
elements, which are operated in parallel. It is noted that absolute accuracy of object 
identification is not required for "consumer" applications, and therefore partial match results may 
be considered useful. A plurality of partial results, when taken together, may also increase 
identification reliability. Critical applications generally differ in quantitative aspects rather than 
qualitatively, and therefore many aspects of the present invention may be applied to mission 
critical and other high reliability applications. 

A preferred object identification method proceeds by first classifying an object in an 
image, e.g., "car", "person", "house", etc. Then, based on the classification and object separation, 
an optimized preprocessing scheme is implemented, based on the classification. This 
classification preprocessing operates on the raw image data relating only to the object, separated 
from the background. Then, after the optimized preprocessing, a parallel recognition system 
would operate to extract unique features and to identify common features to be excluded from the 
comparison. This step could also identify variable features upon which identification should not 
be made because the distinctions are useless for the purpose. Thus, the object image at this point 
loses its relationship to the entire image, and the data reduction might be substantial, providing a 
compact data representation. The preferred algorithm has a tree structure, wherein the 
identification need only differentiate a few possibilities, and pass the result to another branch of 
the tree for further analysis, if necessary. Since the intermediate calculations may help in later 
computations, these should preferably be retained, in order to avoid duplicative analysis. 
Further, the order of analysis should be predetermined, even if arbitrary, so that once a useful 
intermediate calculation is identified, it may be passed in a regular, predictable manner to the 
next stage processing. Of course, one should not ignore that objects in the entire image may be 
correlated with one another, i.e. if one object is present, it would increase or decrease the 
likelihood of another object also being present. Further, temporal correlations should also be 
noted. Thus, the object identification need not proceed upon each object independently. 
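
The tree-structured identification with retained intermediate calculations can be sketched in Python as follows. The features, labels and tree shape are purely hypothetical; the point of the sketch is that each node differentiates only a few possibilities and that a feature, once computed, is cached and reused by later stages.

```python
from typing import Any, Callable, Dict, Optional


class Node:
    """A tree node: either tests one feature or carries a final label."""

    def __init__(self, feature: Optional[str] = None,
                 branches: Optional[Dict[Any, "Node"]] = None,
                 label: Optional[str] = None) -> None:
        self.feature = feature
        self.branches = branches or {}
        self.label = label


def identify(obj: Any, root: Node,
             features: Dict[str, Callable[[Any], Any]],
             cache: Dict[str, Any]) -> str:
    """Walk the tree in its fixed order, reusing cached intermediate results."""
    node = root
    while node.label is None:
        if node.feature not in cache:
            cache[node.feature] = features[node.feature](obj)
        node = node.branches[cache[node.feature]]
    return node.label
```

The cache is passed in by the caller, so the same intermediate values can be handed in a regular, predictable manner to any later stage that examines the same object.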

Based on time sequences of two-dimensional images, a three dimensional image 
representation may be constructed. Alternatively, based on various presumptions about 
extractable "objects" in a single or small group of two dimensional images, a hypothetical three 
dimensional object may be modeled, which may be later modified to reflect the actual image 
when an actual view of hidden surfaces is shown. Therefore, by one means or another a three 
dimensional model is created, having both volume and surface characteristics. Of course, since 
inner structure may never be seen, the model normally emphasizes the surface structure, and is 
thus a so-called two-and-a-half dimensional surface model. Other non-integral dimension 
representations may also be useful, and fractal models may efficiently represent the information 
content of an image model. 

When the source signal is an MPEG encoded datastream, it is advantageous to provide an 
exemplar database which does not require complete expansion of the encoded signal. Thus, the 
motion vector analysis performed by the MPEG encoder may form a part of the pattern 
recognition system. Of course, image sequence description formats other than MPEG may be 
better suited to pattern analysis and recognition tasks. For example, a system may transmit an 
interframe, by any suitable description method, as well as an object decomposed image in, e.g., 
fractal transform codes. The transmitted source material, other than interframes, is then 
transmitted as changes only, e.g. new objects, transforms of existing objects, translations of 
existing objects, etc. 

Color coding may make even more extensive use of fractal compression technology with 
high compression ratios, because absolute accuracy is not necessary; rather photorealism and 
texture are paramount, and need not be authentic. Therefore, backgrounds with significant detail, 
which would require substantial data in a DCT type system, could be simply coded and decoded 
without loss of significant useful information. It is important, in using this method, to 
discriminate between background textures and foreground objects, and to encode each separately, 
optimizing the processing based on the type of object being processed. 

EXAMPLE 26 

DATA CONTEXT SENSITIVE COMPUTER INTERFACE 

The present example relates to a context sensitive computer interface in which a 
characteristic of the interface is modified based on a linguistic or informational content of a data 
object upon which the interface is operating. For example, a number of alternate feature sets 
may be made available based on the type of data which is being operated on by the user. For 
example, differing feature sets would be optimal for each scientific discipline, each type of 
financial or economic field, marketing, retail, distribution, manufacturing, administration, human 
resources, etc. Such an interface will make it possible to provide an extended and extensible 
suite of application modules customized for the user in general, and further adaptive to the 
particular use which the user may be making of the apparatus. Thus, complex options 
particularly suited for the data at hand may be made available without inefficient interface 
searching, while inappropriate options are not presented. It is noted that this interface is 
responsive to the data, rather than the programming. Further, the data is analyzed for its 
meaning, rather than its type. 

In a word processing environment, a document or section of a document is analyzed for 
the presence of particular words or phrases, or for the presence of concepts, interpretable by 
linguistic concepts. This context-sensitive functionality does not require an explicit definition by 
the user, but rather will be present even during an incidental occurrence of a recognized context. 
In accordance with other aspects of the present invention, each context related function may have 
various user levels, which are selected based on an imputed user level of the user. Thus, the 
interface program must actually interpret the text or context of the user document in order to 
select the most likely options for use. 

Thus, if a user were to embed a table in a document, the available options would change 
to table-type options when the "active" portion of the document is at the table, i.e. within the 
viewable area, etc. Further, and more specifically, if the text and context of the table indicate 
that this is a financial table, financial options would be initially provided, and standard financial 
calculation functions immediately made available or performed, in contemplation of their 
prospective use. Similarly, if the data appears to be scientific, a different set of options would be 
initially available, and the standard scientific-type calculation functions be made available or 
performed. If the table relates to chemical or mechanical-type data, chemical or mechanical 
options might be made available, respectively. Embedded graphics, likewise, would be 
associated with graphics functions appropriate to the type of graphic. It is noted that, due to the 
analysis of the content of the document, software having generic functionality may present as 
special purpose software, based on its actual use. 

Thus, in a like manner, the system could determine the "style" of the document and 
automatically format the data in a predetermined manner to conform with general standards of 
presentations relating to the desired style. This is similar to style sheets of many programs, but 
they are self applying, and will, within the same document, be adaptive as the data changes 
context. Further, since the "styles" would be applied automatically, it would be relatively easy to 
alter them, requiring only a small amount of manual effort. This is so because the "keys" by 
which the system determines style could be stored, thus allowing redeterminations to be easily 
made. This context sensitivity could also assist in spelling and grammar checking, where 
different rules may apply, depending on the context. 

The data object includes information, which might be text, arrays of numbers, arrays of 
formulas, graphics, or other data types. The system relates parts of the object to each other by 
"proximity" which could be linear, in the case of a text document, or otherwise, such as in the 
case of a hypertext document or spreadsheet. Those parts or elements of the object closest to 
each other, by whatever criteria, are presumed to be topically related, regardless of data type. 
Thus, if a paragraph of text is proximate to a table for numbers, then the type of numbers 
presumed to occupy the table would relate to the content of the proximate text. If the text relates 
to finance, i.e. uses financial-related terms, or series of words that often occur in financial 
contexts, the table would be presumed to be a financial table. 
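
The proximity rule can be illustrated with a minimal Python sketch. It assumes a linear document held as a list of (kind, content) elements, measures proximity by position in that list, and recognizes context with small keyword vocabularies; the vocabularies and function names are hypothetical, and a practical system would use a richer linguistic analysis.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical keyword vocabularies, for illustration only.
VOCAB: Dict[str, Set[str]] = {
    "financial": {"revenue", "profit", "dividend", "interest", "quarter"},
    "scientific": {"hypothesis", "sample", "experiment", "molar", "velocity"},
}


def text_context(text: str) -> Optional[str]:
    """Context whose vocabulary occurs most often in the text, or None if none occurs."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {ctx: sum(w in vocab for w in words) for ctx, vocab in VOCAB.items()}
    best = max(scores, key=lambda ctx: scores[ctx])
    return best if scores[best] > 0 else None


def table_context(elements: List[Tuple[str, str]], table_index: int) -> Optional[str]:
    """Presume the table's context from the nearest text element that has one."""
    texts = [(abs(i - table_index), i)
             for i, (kind, _) in enumerate(elements) if kind == "text"]
    for _, i in sorted(texts):
        ctx = text_context(elements[i][1])
        if ctx is not None:
            return ctx
    return None
```

For a hypertext document or spreadsheet, the list position would be replaced by whatever proximity measure suits the structure, for example link distance or cell adjacency.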

Once the context of the part of the object is determined, the system then acts based upon 
this context. The major act is the presentation of tailored menus. This means that if the context 
is financial, the menus available for use with the numeric table relate to financial tables or 
spreadsheets. Further, the proximate text would be subject to financial oriented spellcheck and 
financial oriented grammar or style check. If a graphics-option is selected proximate to the text 
and table, the menu options would presume a financial graph and present appropriate choices. Of 
course, the options need not be limited to a few types, and may be hybrid and/or adaptive to the 
style of the user. However, it is noted that the adaptive menus could be linked to a "corporate 
style". Thus, communication styles could be dictated by a set of global rules for an organization. 
Of course, these a priori choices could be overridden. 
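
A tailored menu with a corporate style and a user override may be sketched in Python as follows. The menu entries, the context names and the function `menu_for` are all hypothetical; the sketch shows only the precedence suggested above, in which global rules refine the context menus and an explicit user choice overrides both.

```python
from typing import Dict, List, Optional

# Hypothetical menu sets keyed by context.
MENUS: Dict[str, List[str]] = {
    "financial": ["Net present value", "Amortization schedule", "Currency format"],
    "scientific": ["Curve fit", "Unit conversion", "Significant figures"],
    "generic": ["Sort", "Sum", "Format cells"],
}


def menu_for(
    context: Optional[str],
    corporate_style: Optional[Dict[str, List[str]]] = None,
    user_override: Optional[List[str]] = None,
) -> List[str]:
    """Return the menu for a context; a user override takes precedence over all rules."""
    if user_override is not None:
        return list(user_override)
    menus = dict(MENUS)
    if corporate_style:
        menus.update(corporate_style)  # organization-wide rules replace the defaults
    if context in menus:
        return list(menus[context])
    return list(menus["generic"])
```

An unrecognized or missing context falls back to the generic menu, so the interface remains usable when no context can be imputed.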

An advantage of this system is that it allows a software system to include a wide range of 
functionality which remains "buried", or relatively inaccessible, based on the context of usage. 
Thus, feature rich software would be considered more usable, and software could be provided in 
modular fashion. Since the system might allow a user to have potential access to many software 
modules, the system could also be linked to a license manager and per use billing system for 
rarely used modules, while allowing these to remain available on, e.g., a CD ROM. Thus, for 
example, a full integrated package could employ a single, "standard" interface which would not 
require task-switching programs, while avoiding presentation of the full range of features to the 
user at each juncture. 

This system provides advantages over traditional systems by providing a non-standardized 
interface with a variable feature set which attains usability by adapting a subset of 
the available functionality based on the context of the data. 

EXAMPLE 27 

GROUP AWARE ADAPTIVE COMPUTER INTERFACE 

The adaptive interface according to the present invention may be used in group 
computing applications. In such a case, the predictive functionality is applied to allow the 
interface to apply rules from one group member to a project, even when that group member has 
not contributed personally to a particular aspect. This is thus a type of intelligent agent 
technology, which, according to the present invention, includes the characteristics of abstraction 
and extrapolation, rather than rule based analysis which would fail based on divergent 
circumstances. This differs from a standard rule-based expert system because the intelligence 
applied is not necessarily "expert", and may be applied in a relative fashion. Further, extracted 
user characteristics need not completely define a solution to a problem, and indeed, the use of 
such a technology in group situations presupposes that a contribution of a number of users is 
desirable, and therefore that the expertise of any given user is limited. 

In order to ensure data integrity after the application or contingent application of user 

10 characteristics to a datastream, it is desirable to trace the evolution of data structures. This also 
allows for assistance in the organization and distribution of workgroup responsibilities. Thus, in 
a workgroup situation, the goal is not optimization of individual productivity, but rather 
optimization of the group result, including all levels of review after an initial phase is complete. 
Thus, while an individual user may seek various shortcuts to achieve various results, the 

15 group would benefit by having available all information relating to the path taken to achieve that 
result. Further, the desired result may be modified according to the presumed actions of the 
group, so that the final product is pre-optimized for the group, rather than the individual. Thus, a 
group member may have his "rules" extracted from his actions, i.e. by neural net 
backpropagation of errors programming or fuzzy rule definition, to be presented for
consideration by another group member. This strategy will allow "better" drafts by considering
the predicted input of a member prior to review by that member. A user may further tailor the 
rules for a given project, and "distilled wisdom" from non-group members may also be 
employed, as in normal expert (AI) systems. 

This rule-extraction technology as applied to workgroups is enhanced by the context
sensitivity of the software, where the input of each group member may be weighted by
considering the context. Again, this technique may be used to increase the efficiency of the
primary author of a section of a project, as well as better defining the scope of responsibility of 
each member, while still respecting the input of other group members. 

According to this workgroup rule extraction technology, points of conflict between group
members are highlighted for resolution. As an adjunct to this resolution phase of a project,
videoconferencing may be employed. Further, where a conflict of a similar type had occurred in
the past, data relating to the resolution of that conflict, including recorded videoconference, may 
be retrieved and presented to one or more members of the workgroup. In this way, such conflicts 
may be resolved before they become adversarial. Thus, each group member may efficiently
proceed independently, with only major issues requiring meetings and the like to resolve.

If a workgroup member disagrees with an imputed rule, either explicitly, by review of the 
rules, or implicitly, by a review of the results, the system will allow a review of all decisions 
influenced by that faulty rule, as well as a proposed correction. This may be addressed by any
member of the group, but usually the author of the section or the source of the rule will be the
relevant reviewing individual. Rules may also be created by the group, rather than from a single
individual. Such rules are more often explicitly defined, rather than derived from observation. 
Such group rules may also be subjected to adaptive forces, especially when overridden 
frequently. 
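
By way of illustration only, the review of decisions influenced by a disputed rule, and the adaptation of frequently overridden rules, might be sketched as follows. The class name `RuleLedger`, its methods, and the override threshold are hypothetical and are not part of the disclosure; the sketch assumes each rule and each decision carries a simple identifier.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RuleLedger:
    """Hypothetical record of which imputed rules influenced which decisions."""

    decisions_by_rule: dict = field(default_factory=lambda: defaultdict(list))
    overrides: dict = field(default_factory=lambda: defaultdict(int))

    def record(self, rule_id, decision_id):
        # Remember that this decision was influenced by this rule.
        self.decisions_by_rule[rule_id].append(decision_id)

    def decisions_to_review(self, rule_id):
        # Every decision to revisit if a group member disputes the rule.
        return list(self.decisions_by_rule.get(rule_id, []))

    def note_override(self, rule_id, threshold=3):
        # Count an override; return True once the rule has been overridden
        # often enough (assumed threshold) that it should be adapted.
        self.overrides[rule_id] += 1
        return self.overrides[rule_id] >= threshold
```

Keeping the ledger separate from the rules themselves means the same trace can serve both the data-integrity review and the adaptation of group rules.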

EXAMPLE 28

ADAPTIVE INTERFACE VEHICULAR CONTROL SYSTEM 
It is noted that the adaptive user level interface is of use in uncontrolled environments,
such as in a moving vehicle, especially for use by a driver. An intelligent system of the present 
invention would allow the driver of such a vehicle to execute control sequences, which may
compensate for the limited ability to interact with an interface while driving. Thus, the driver
need not explicitly control all individual elements, because the driver is assisted by an intelligent 
interface. Thus, for example, if it begins raining, the interface would predict the windshield 
wipers should be actuated, the windows and any roof opening closed, and the headlights 
activated. Thus, the driver could immediately assent to these actions, without individually
actuating each control. In such a case, the screen interface, which may be a heads-up display,
would provide a small number of choices, which may be simply selected. Further, under such 
conditions, there would likely be a large amount of mechanical jitter from the input device, 
which would be filtered to ease menu selection. Further, this jitter would indicate an unstable 
environment condition, which would cause the interface to present an appropriate display. A
voice input may also be used.
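
As a non-limiting sketch, the jitter filtering and the detection of an unstable environment described above might be approximated as follows. An exponential moving average is used here only as a simple stand-in for whatever filter an implementation would employ; the function names, the smoothing factor `alpha`, and the `jitter_threshold` value are assumptions.

```python
from statistics import pstdev


def smooth(samples, alpha=0.3):
    """Exponential moving average over raw input-device positions."""
    out = []
    level = None
    for x in samples:
        # The first sample seeds the filter; later samples are blended in.
        level = x if level is None else alpha * x + (1 - alpha) * level
        out.append(level)
    return out


def is_unstable(samples, jitter_threshold=5.0):
    """Flag an unstable environment when raw input deviates strongly
    from its smoothed value (threshold is an assumed tuning value)."""
    if len(samples) < 2:
        return False
    residuals = [x - s for x, s in zip(samples, smooth(samples))]
    return pstdev(residuals) > jitter_threshold
```

The smoothed positions would drive menu selection, while the instability flag could cause the interface to switch to a display with fewer, larger choices.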

EXAMPLE 29

ADAPTIVE INTERFACE VEHICULAR CONTROL SYSTEM 

An integrated electronics system for an automobile is provided having control over 
engine, transmission, traction control, braking, suspension, collision avoidance, climate control, 
and audio systems. Steering and throttle may also be controlled. Based on driver preference and 
action patterns, the system may optimize the vehicle systems. For example, the vehicle may 
anticipate voluntary or road conditions based on implicit inputs of the user, thus readying 
vehicular systems prior to the actual encounter with certain conditions. Further, a user interface 
may be simplified, based on probable required functionality, thus limiting required attention by 
the driver in order to activate a particular control. By providing such an interface, controls 
normally inaccessible may be made accessible, without increasing mechanical complexity, e.g., 
functions normally controlled by computer may be accessed through a common user interface,
rather than through dedicated manual controls. 

The automobile control system may also include collision avoidance systems, which may 
include imaging sensors and radar or LIDAR ranging and velocity measurement. According to 
the present invention, a heads-up display or simplified graphic user interface in the dashboard or 
near the steering wheel presents predicted options to the driver. An auxiliary interface may also 
make certain options available for passengers. 

According to another aspect of the present invention, an automobile positioning system is 
provided, which may be extraterrestrial, e.g., GPS, or terrestrial, e.g., cellular base station, 
LORAN, etc. Such a system is described in U.S. Patent No. 5,390,125, incorporated herein by 
reference; see references cited therein. A controller in the automobile is provided with an 
itinerary for the vehicle travel. Based on position and itinerary, the vehicle may communicate 
with various services, such as food, fuel and lodging providers, to "negotiate" for business. The 
driver may be provided with customized "billboards", directed to his demographics. 
Reservations and discounts may all be arranged while en-route. Communication between the 
automobile and the services is preferably provided by a CDPD service, which is a cellular-based
832 MHz band digital data transmission system. Therefore, an existing cell phone system or
CDPD modem system may be employed for telecommunication. Preferably, a simple display is 
provided for presentation of commercial messages to the driver or passenger and for interacting
with the service. 

As a matter of practice, the service may be subsidized by the service providers, thus 
reducing the cost to the consumer. The extent of the subsidy may be determined by the amount 
of data transmitted or by the eventual consummation of the transaction negotiated. 

Because of the positioning system, any variance from the itinerary may be transmitted to 
the service providers, so that reservations may be cancelled, or substitute services provided in a 
different location or at a different time. 
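
Purely as an illustrative sketch, a variance from the itinerary might be detected by comparing the reported position against the planned waypoints. The great-circle (haversine) distance, the function names, and the 5 km tolerance are assumptions introduced for the example and are not taken from the disclosure.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def off_itinerary(position, waypoints, tolerance_km=5.0):
    """True when the vehicle is farther than tolerance_km from every waypoint."""
    return all(haversine_km(position, w) > tolerance_km for w in waypoints)
```

When `off_itinerary` returns true, the controller could transmit the variance so that reservations are cancelled or substitute services are arranged.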

The telecommunication system may also be used as an emergency system, to contact 
emergency services and/or police in the event of accident or distress. The transponder system 
may also be part of an antitheft system. The transponder may also be part of a vehicular 
maintenance and diagnostic system to ensure proper servicing and to help determine the nature of 
problems. Raw or processed data may be transmitted to a centralized station for full analysis and 
diagnosis. Because the vehicle need not be at the repair shop for diagnosis, problems may be
analyzed earlier and based on extensive, objective sensor data. 

EXAMPLE 30 

INTELLIGENT INTERNET APPLIANCE 

A further application of the present technologies is in a so-called "Internet appliance". 
These devices typically are electronic devices which have a concrete function (i.e., do more than 
merely act as a generic server) and typically employ at least as a secondary interface, a web 
browser. In addition, these devices provide a TCP/IP network connection and act as a web 
server, usually for a limited type of data. Therefore, in addition to any real human interface on 
the device, a web browser may be used as a virtual interface. 

According to the present invention, such an Internet appliance is provided with
advanced features, for example adaptivity to the user, to the
environment, or intelligent algorithms which learn. In fact, a preferred embodiment provides a 
rather generic device which serves as a bridge between the Internet, a public packet switched 
network which employs TCP/IP, and a local area network, for example in a residential, industrial 
or office environment. The device may further abstract the interface functions for a variety of 
other devices as nodes on either the Internet or local area network, to provide a common control
system and interface. 

A preferred embodiment also encompasses certain other features which may be used as 
resources for the networked devices or as usable features of the device. 

The Internet, or other wide area network, may be connected in any known manner, for 
example, X.25/ISDN D-channel, dial-up over POTS (e.g., v.34, v.90, v.91), ISDN, xDSL, 
ADSL, cable modem, frame relay, T1 line, ATM, or other communications system. Typically, a
system is provided with either a commonly used access method, such as v.90 or ISDN, or a 
replaceable communications module with a generic interface. Such systems are well known. 

The local area network is also well known, and may include, for example, as a physical 
layer, 10 Base T, 100 Base T, HomeRun (Cat. 3 twisted pair/telephone twisted pair/power line 
transmission, from Intel Corp., e.g., Intel 21145 device/Tut systems), Universal Serial Bus 
(USB), FireWire (IEEE-1394), optical fiber, or other known computer network. The protocol
may be, for example, TCP/IP, IPX, ATM, USB, IEEE-1394, or other known or proprietary 
appropriate communications protocol. 

While not required, a particular aspect of a preferred embodiment according to the 
present invention is the ability to interface "dumb" devices as nodes on the LAN with an 
intelligent device, while allowing the user to interact primarily with the intelligent device. This
scheme therefore reduces redundancy and increases functionality. 

Therefore, in an exemplary embodiment, an intelligent home is established, with most or 
all electrical appliances and electronic devices interfaced with the system, for example through 
the aforementioned HomeRun system, using any of the supported physical layers. Each device is
provided as a relatively simple control, for example, remotely controllable (or where applicable, 
dimmable) lights, control over normal use and peak electrical demand of heavy appliances, as 
well as inter-device communications for consumer electronics. Therefore, the intelligent device 
acts as an external communications and control node for the entire network, and may, for 
example, control telephony functions in addition. 

Exemplary devices to be controlled in a home include household appliances, HVAC, 
alarm systems, consumer electronics, and the like, and/or devices provided for communications purposes.
An alarm system embodiment, for example, may employ a video camera input for capture and 
analysis of images, as well as motion or irregularity detection. The intelligent device may, for
example, employ neural networks or other intelligent analysis technology for analyzing data 
patterns indicative of particular states. An alarm output may be produced, for example, through 
standard alarms, as well as through a telephone interface of the system. 

The system may therefore set/control/monitor the status of any home-based device - 
oven, stove, alarm, washing machine, dryer, iron, lights, computer, oil/gas burner, thermostat, 
location of automobiles, camera, pump (pool, sump), sprinkler, stereo/video systems, home 
surveillance system. This may be especially important if the user is away from home for an 
extended period of time, or if he or she wants to change the schedule of something, or travel 
plans change. For a home surveillance system, pattern recognition may be employed to monitor 
all sensors, including cameras, to detect abnormal patterns or changes in condition. 
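
For illustration only, detection of an abnormal sensor condition might be sketched with a simple statistical test. A z-score against recent history is used here as a deliberately simple stand-in for the pattern recognition or neural network analysis contemplated above; the function name and the limit of three standard deviations are assumptions.

```python
from statistics import mean, pstdev


def is_abnormal(history, reading, z_limit=3.0):
    """Return True when a reading departs from recent history by more than
    z_limit standard deviations (assumed threshold)."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly constant history: any change at all is unusual.
        return reading != mu
    return abs(reading - mu) / sigma > z_limit
```

In a home surveillance setting, such a check could be run per sensor, with a positive result routed to the alarm output or the telephone interface.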

Thus, since the intelligent device incorporates a web server, the physical proximity of the 
user is not critical for interaction with the device, and all devices on the LAN may be controlled 
remotely, automatically, and in synchrony. 

In one embodiment, the intelligent device includes a videoconferencing/video capture 
system, including any or all known features for such systems, for example as described in the 
background of the invention. Therefore, in addition to a base level of functionality, such an 
embodiment would also likely include (a) telephony interface, (b) video capture, (c) video codec, 
(d) audio capture, (e) audio codec, (f) full duplex speakerphone, (g) video output, and (h) audio 
output. 

In another embodiment, a speech interface is provided for interpreting human speech as 
an input and/or producing synthesized speech as an output. Therefore, such a device would 
include speech recognition and/or synthesis technologies, as well as a semantic data processor. 

Preferably, the device allows use of a simplified web browser interface, such as
may be supported by personal digital assistants (PDAs) and enhanced digital data cellular
telephones, e.g., handheld device markup language (HDML). This, for example, allows a remote 
user to communicate through wireless networks or the like, and therefore avoids the need for a 
full personal computer as a human interface. 

Advantageously, the device may be interfaced with a telephone communication system, 
allowing use as a voice and/or video message recorder, and allowing remote access to the stored 
information, either through a dialup connection and/or through the network. In this case, the
intelligent device may act as a computer telephony interface, and all communications devices
logically under this device act as "net phones", i.e., voice communications devices which
communicate over data networks. Therefore, all telephony control and computer telephony
functions may be integrated into the device, for example, voice mail, auto-attendant, call center,
and the like. Further, the Internet interface allows remote messaging and control over the 
telephony system, as well as virtual networking, Internet telephony, paging functions, and voice 
and data integration. 

The intelligent device may also interface with various media electronics devices, and, for
example, may act as a "rights server" or other aspect of a copyright protection and royalty
collection/enforcement system. Typically, these functions entail e-commerce functions, and may 
require X.22 and/or XML communications and translations. In addition, such functions also 
typically involve encryption/decryption, as well as key management, which are also preferably 
supported by the device. Such support may be in hardware or software. 

Another aspect of the invention provides an index and/or catalog database for media
information or media metadata information. Thus, data relating to a VCR tape or other recorded
media may be subjected to search criteria without requiring access or contemporaneous analysis 
of the media content itself. Therefore, a preferred embodiment of the intelligent device includes 
mass storage and retrieval capability, for example, magnetic disk, RW-CD, or RW-DVD. This
mass storage and retrieval capability may be used, not only for databases, but also for computer
software, media and content storage and retrieval. Thus, the device may also serve as a video 
data recorder, capturing video data and storing it digitally, for example, employing the 
aforementioned video and audio codecs. In this case, it is preferable that the intelligent device 
also include a direct media access port, for example a broadcast TV tuner, ATSC/HDTV tuner,
cable tuner, DVD reader, CD reader, satellite video decoder, NTSC composite/S-VHS, and/or
other type of media content information input. With such storage, the intelligent device may also
assume the standard functions of computer network servers, for example, file serving, print 
serving, fax serving, application serving, client/server application support, as well as traditional 
networking functions, such as bridging, routing, switching, virtual private network,
voice-over-IP, firewall functions, remote access serving, and the like. It should also be apparent that the
intelligent device may also serve as a personal computer itself, and thus does not require
additional systems for basic functionality. 

In a media recording system embodiment, the system preferably notifies the user if the 
"program", i.e., instructions, are incomplete, ambiguous, or impossible to complete. For 
example, if a single channel selector is provided, no more than one channel may be monitored at
a time. Further, where irreversible actions are necessary, the user is preferably informed and 
allowed to make a choice, for example, if lack of storage space forces a choice to be made 
between new and archival material. A conflict management system is provided which arbitrates 
between the conflicting demands, for example if a second user is programming the same device 
(for example, the VCR) to record a show at the same time.
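
As a minimal sketch under stated assumptions, the detection of conflicting recording demands on a device having a single channel selector might look like the following. The `Request` structure, its fields, and the use of minutes since midnight are hypothetical conveniences for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical recording request; times are minutes since midnight."""
    user: str
    channel: int
    start: int
    end: int


def conflicts(requests):
    """Return pairs of requests that overlap in time on different channels,
    which a single-tuner device cannot satisfy simultaneously."""
    found = []
    ordered = sorted(requests, key=lambda r: r.start)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # later requests start even later; no more overlaps
            if a.channel != b.channel:
                found.append((a, b))
    return found
```

Each reported pair could then be presented to the affected users for arbitration, rather than being silently resolved in favor of one request.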

Thus, it is apparent that the intelligent device according to this embodiment of the present 
invention may incorporate many different functions, some of which are defined purely by 
software and processing availability, and others by particular hardware devices for performing 
specific functions. 

Another aspect of the invention defines a special training mode of the intelligent device,
which allows the user to improve the functionality of the system by ensuring that any intelligence
algorithms will correctly operate in an anticipated and/or desired manner. In this mode,
responses of the user are provoked which indicate user preferences, preferably in a manner which 
resolves ambiguities encountered with prior data sets. Thus, where the system identifies a
situation where a decision is difficult, e.g., where the data analysis does not output any selected
actions which will likely correspond to the user desires or preferences, or where ex post facto the 
user indicates that an inappropriate choice was made, the particular data structures may be stored 
and abstracted for later presentation to the user. In this case, such structures are presented by the 
system to the user, during a training session, to train the system relating to the desired response
to particular data environments. In this way, the user is not necessarily burdened with training
tasks during normal use of the device, and opportunities for such training are not lost. Where the 
system is untrained, and an "intelligent" response or mode of operation cannot be resolved, a 
default mode of operation may be defined. Further, such a default mode is preferably always 
available, at the request of the user, thus allowing use where an adaptive system is undesired or
difficult to employ.
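
The deferral of difficult decisions to a later training session, with fallback to a default mode, might be sketched as follows. This is an illustration only; the class name `TrainingQueue`, the confidence scores, and the `min_confidence` cutoff are assumptions rather than features recited above.

```python
from collections import deque


class TrainingQueue:
    """Hypothetical store of situations the system could not resolve."""

    def __init__(self, min_confidence=0.6):
        self.min_confidence = min_confidence
        self.pending = deque()

    def decide(self, situation, scored_actions, default_action):
        """Return the best-scoring action, or the default action when no
        candidate is confident enough; unresolved situations are queued."""
        best_action, best_score = max(
            scored_actions.items(),
            key=lambda kv: kv[1],
            default=(default_action, 0.0),
        )
        if best_score < self.min_confidence:
            self.pending.append(situation)  # keep for a later training session
            return default_action
        return best_action

    def next_training_case(self):
        """Oldest stored situation to present to the user, or None."""
        return self.pending.popleft() if self.pending else None
```

Because unresolved situations are queued rather than raised immediately, the user is not interrupted during normal use, yet the training opportunity is retained.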

In a television application, the Internet appliance preferably has access to an electronic
program guide (EPG). Such EPG systems are known, and typically provide an efficient starting
point for user programming. These EPGs may be provided as an embedded signal in a broadcast
stream, through a dial-up network, through the Internet, or on distribution media, such as
CD-ROM, OCR scanning of TV-Guide (or the like) or other known means. EPGs contain a concise
semantic description of program content, which typically is both sufficient for user evaluation, 
and brief enough for rapid evaluation. The system may therefore analyze user preferences in this 
semantic space and provide adaptive presentation of elements of the EPG to the user. Of course, 
a media data stream analysis embodiment of the invention, as disclosed above, may be used in
conjunction with or in lieu of the EPG system.
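
By way of example and not limitation, an analysis of user preferences over EPG descriptions might be sketched with simple keyword weights. The weight table, function names, and guide entries are invented for illustration; an actual embodiment could derive the weights adaptively from viewing history.

```python
def score_program(description, weights):
    """Sum the assumed preference weights of the words in a description."""
    words = description.lower().split()
    return sum(weights.get(w.strip(".,;:!?"), 0.0) for w in words)


def rank_guide(entries, weights, top_n=3):
    """Return up to top_n titles with a positive score, best first.

    entries is a list of (title, description) pairs.
    """
    scored = [(score_program(desc, weights), title) for title, desc in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:top_n] if score > 0]
```

A short usage example with hypothetical data:

```python
weights = {"comedy": 2.0, "news": -1.0, "jazz": 1.5}
guide = [
    ("Evening Report", "Local news and weather."),
    ("Late Laughs", "Stand-up comedy special."),
    ("Blue Notes", "Live jazz concert."),
]
print(rank_guide(guide, weights))  # ['Late Laughs', 'Blue Notes']
```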

The system preferably maintains an updated index of available data. Thus, newly 
acquired data is added to the index, and deleted data is purged from the index. The system 
preferably compares new data to previously encountered data, to avoid redundant processing. 
For example, the system preferably recognizes events/programs that have previously been
recorded, and checks to determine whether they are still in the index. In this context, the user is
preferably provided with low-level file maintenance tools, for example to manually control the 
addition or deletion of data, which is then correctly represented in the index. 
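
A minimal sketch of such an index, assuming that a program is identified by its title and an episode label, is given below. The class `MediaIndex`, the hashing of the identifying fields, and the storage location strings are hypothetical.

```python
import hashlib


class MediaIndex:
    """Hypothetical index mapping a program identity to its stored location."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def key_for(title, episode):
        # Normalize case so the same program is recognized on re-encounter.
        return hashlib.sha256(f"{title.lower()}|{episode}".encode()).hexdigest()

    def add(self, title, episode, location):
        """Index new data; return False if it was already indexed."""
        key = self.key_for(title, episode)
        if key in self._by_key:
            return False  # previously recorded, skip redundant processing
        self._by_key[key] = location
        return True

    def remove(self, title, episode):
        """Purge deleted data; return True if an entry was removed."""
        return self._by_key.pop(self.key_for(title, episode), None) is not None

    def __len__(self):
        return len(self._by_key)
```

Manual file maintenance tools would call the same `add` and `remove` methods, so that user-initiated changes are reflected in the index.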

Because the Internet appliance is connected to the Internet, so-called multicasts may be 
monitored for correspondence with user preferences. Therefore, it is understood that the
operation of the present invention is not limited to traditional television broadcasts, and that
streaming video and audio, as well as stored images, sound files (e.g., MIDI, MP3, A2B, 
RealAudio), text, and multimedia streams may be analyzed based on the adaptive principles 
presented herein. 

The system may also integrate Internet data with other types of data, for example 
providing access to stored or static data corresponding to a data stream. The retrieval and storage
of such data may also be adaptively controlled in accordance with the present invention. Thus, it 
is expressly understood that the intelligent device may act as a "VCR" (albeit not necessarily 
employing a known type of videocassette tape), to record media. 

The Internet appliance may also operate autonomously, capturing data which corresponds 
to user preferences and profiles, thus reducing latency for the user, and potentially shifting data
transfers to off-peak periods. Such a system operates in this mode as a so-called "agent" system.
Likewise, the device may also be linked to other intelligent devices, to provide an intelligent 
interaction therebetween. 

The preferred user interface maintains user levels constant over long periods, i.e., not 
rapidly adaptive, to allow for quick accessing over a low bandwidth connection, such as a 
telephone, or using succinct displays, such as might be found on a personal digital assistant. 
Thus, the user can rely on memory of the interface functionality and layout to reduce data 
transmissions and reduce search time. In one embodiment, the interface may be "forced" to a 
particular type, as either a permanent interface, or as a starting point for adaptivity. Thus, the 
user may be provided with an interface design mode of operation. 

The user interaction with each "device", which may be real or virtual (implemented as a 
software construct in a relatively general purpose computer), is preferably carefully designed for 
each device. A common user interface paradigm is preferably provided for corresponding 
functions, while the user interface is preferably optimized for dealing with the specific functions 
of each particular device. Thus, a similar user interface and screen layout is employed for 
functions that are the same across a variety of devices. In this regard, it is an aspect of an 
embodiment of the invention to translate user interface systems, even in a high level state, to 
other forms. Thus, in a multi-brand environment, related components may have native interfaces 
which are both well developed and distinctly different. Therefore, the present invention allows 
for a translation or remapping of the functionality into a common paradigm. Where aspects 
cannot be adequately translated, the native interface may be presented to the user. 
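
The remapping of native interfaces into a common paradigm, with fallback to the native interface where translation is not possible, might be sketched as follows. The device names, command strings, and table layout are invented for the example and do not correspond to any actual product.

```python
# Hypothetical per-device tables mapping common verbs to native commands.
NATIVE_MAPS = {
    "brand_a_vcr": {"play": "PLY", "stop": "STP", "record": "REC"},
    "brand_b_dvd": {"play": "start_playback", "stop": "halt"},
}


def translate(device, verb):
    """Map a common-paradigm verb to a device's native command.

    Returns ("command", native) when a translation exists, otherwise
    ("native_interface", device) to signal that the device's own
    interface should be presented to the user.
    """
    native = NATIVE_MAPS.get(device, {}).get(verb)
    if native is None:
        return ("native_interface", device)
    return ("command", native)
```

With such a table, the same "play" control can be shown for every device, while an unsupported function such as "record" on the second device falls back to that device's native interface.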

It should be understood that the preferred embodiments and examples described herein 
are for illustrative purposes only and are not to be construed as limiting the scope of the present 
invention, which is properly delineated only in the appended claims. 